# Theory IO

theory IO
imports
Main
begin

text ‹
›

subsection‹Real World›
text ‹
We model the real world with a fake type.

WARNING:
Using low-level commands such as ⬚‹typedecl› instead of high-level ⬚‹datatype› is dangerous.
We explicitly use a ⬚‹typedecl› instead of a ⬚‹datatype› because we never want to instantiate
the world. We don't need a constructor, we just need the type.

The following models an arbitrary type we cannot reason about.
Don't reason about the complete world! Only write down some assumptions about parts of the world.
›
typedecl real_world (‹🌐›)

text‹
For examples, see 🗏‹HelloWorld_Proof.thy›.
In said theory, we model ▩‹STDIN› and ▩‹STDOUT› as parts of the world and describe how this part
of the wold can be affected. We don't model the rest of the world. This allows us to reason about
▩‹STDIN› and ▩‹STDOUT› as part of the world, but nothing more.
›

text ‹
The set of all functions which take a \<^typ>‹🌐› and return an \<^typ>‹'α› and a \<^typ>‹🌐›.

The rough idea of all IO functions is the following: You are given the world in its current state.
You can do whatever you like to the world. You can produce some value of type \<^typ>‹'α› and you
have to return the modified world.

For example, the ▩‹main› function is Haskell does not produce a value, therefore, ▩‹main› in
Haskell is of type ▩‹IO ()›. Another example in Haskell is ▩‹getLine›, which returns ▩‹String›.
It's type in Haskell is ▩‹IO String›. All those functions may also modify the state of the world.
›

typedef 'α io = "UNIV :: (🌐 ⇒ 'α × 🌐) set"
proof -
show "∃x. x ∈ UNIV" by simp
qed

text ‹
Related Work:
∗‹Programming TLS in Isabelle/HOL› by Andreas Lochbihler and Marc Züst uses a partial function
(‹⇀›).
⬚‹
typedecl real_world
typedef 'α io = "UNIV :: (🌐 ⇀ 'α × 🌐) set" by simp
›
We use a total function. This implies the dangerous assumption that all IO functions are total
(i.e., terminate).
›

text ‹
The ⬚‹typedef› above gives us some convenient definitions.
Since the model of \<^typ>‹'α io› is just a mode, those definitions should not end up in generated
code.
›
term Abs_io ― ‹Takes a \<^typ>‹(🌐 ⇒ 'α × 🌐)› and abstracts it to an \<^typ>‹'α io›.›
term Rep_io ― ‹Unpacks an \<^typ>‹'α io› to a \<^typ>‹(🌐 ⇒ 'α × 🌐)››

text‹
Within an \<^typ>‹'α io› context, execute \<^term>‹action⇩1› and \<^term>‹action⇩2› sequentially.
The world is passed through and potentially modified by each action.
›
definition bind :: "'α io ⇒ ('α ⇒ 'β io) ⇒ 'β io" where [code del]:
"bind action⇩1 action⇩2 = Abs_io (λworld⇩0.
let (a, world⇩1) = (Rep_io action⇩1) world⇩0;
(b, world⇩2) = (Rep_io (action⇩2 a)) world⇩1
in (b, world⇩2))"

text ‹
In Haskell, the definition for ▩‹bind› (▩‹>>=›) is:
▩‹
(>>=) :: IO a -> (a -> IO b) -> IO b
(action1 >>= action2) world0 =
let (a, world1) = action1 world0
(b, world2) = action2 a world1
in (b, world2)
›
›

hide_const (open) bind

lemma "bind (foo :: 'α io) (λa. bar a) = foo ⤜ (λa. bar a)"
by simp

definition return :: "'α ⇒ 'α io" where [code del]:
"return a ≡ Abs_io (λworld. (a, world))"

hide_const (open) return

text ‹
In Haskell, the definition for ▩‹return› is::
▩‹
return :: a -> IO a
return a world0  =  (a, world0)
›
›

lemma left_id:
fixes f :: "'α ⇒ 'β io" ― ‹Make sure we use our \<^const>‹IO.bind›.›
shows "(IO.return a ⤜ f) = f a"
by(simp add: return_def IO.bind_def Abs_io_inverse Rep_io_inverse)

lemma right_id:
fixes m :: "'α io" ― ‹Make sure we use our \<^const>‹IO.bind›.›
shows "(m ⤜ IO.return) = m"
by(simp add: return_def IO.bind_def Abs_io_inverse Rep_io_inverse)

lemma bind_assoc:
fixes m :: "'α io" ― ‹Make sure we use our \<^const>‹IO.bind›.›
shows "((m ⤜ f) ⤜ g) = (m ⤜ (λx. f x ⤜ g))"
by(simp add: IO.bind_def Abs_io_inverse Abs_io_inject fun_eq_iff split: prod.splits)

subsection‹Code Generator Setup›
text ‹
We don't expose our \<^const>‹IO.bind› definition to code.
We use the built-in definitions of the target language (e.g., Haskell, SML).
›
code_printing constant IO.bind ⇀ (Haskell) "_ >>= _"
and (SML) "bind"
| constant IO.return ⇀ (Haskell) "return"
and (SML) "(() => _)"

text‹SML does not come with a bind function. We just define it (hopefully correct).›
code_printing code_module Bind ⇀ (SML) ‹
fun bind x f () = f (x ()) ();
›
code_reserved SML bind return

text‹
Make sure the code generator does not try to define \<^typ>‹'α io› by itself, but always uses
the one of the target language.
For Haskell, this is the fully qualified Prelude.IO.
For SML, we wrap it in a nullary function.
›
code_printing type_constructor io ⇀ (Haskell) "Prelude.IO _"
and (SML) "unit -> _"

text‹
In Isabelle, a \<^typ>‹string› is just a type synonym for \<^typ>‹char list›.
When translating a \<^typ>‹string› to Haskell, Isabelle does not use Haskell's ▩‹String› or
▩‹[Prelude.Char]›. Instead, Isabelle serializes its own
▩‹data Char = Char Bool Bool Bool Bool Bool Bool Bool Bool›.
The resulting code will look just ugly.

To use the native strings of Haskell, we use the Isabelle type \<^typ>‹String.literal›.
This gets translated to a Haskell ▩‹String›.

A string literal in Isabelle is created with \<^term>‹STR ''foo'' :: String.literal›.
›

text‹
We define IO functions in Isabelle without implementation.
For a proof in Isabelle, we will only describe their externally observable properties.
For code generation, we map those functions to the corresponding function of the target language.

Our assumption is that our description in Isabelle corresponds to the real behavior of those
functions in the respective target language.

We use ⬚‹axiomatization› instead of ⬚‹consts› to axiomatically define that those functions exist,
but there is no implementation of them. This makes sure that we have to explicitly write down all
our assumptions about their behavior. Currently, no assumptions (apart from their type) can be
›
axiomatization
println :: "String.literal ⇒ unit io" and
getLine :: "String.literal io"

text ‹A Haskell module named ▩‹StdIO› which just implements ▩‹println› and ▩‹getLine›.›
code_printing code_module StdIO ⇀ (Haskell) ‹
module StdIO (println, getLine) where
import qualified Prelude (putStrLn, getLine)
println = Prelude.putStrLn
getLine = Prelude.getLine
›                              and (SML) ‹
(* Newline behavior in SML is odd.*)
fun println s () = TextIO.print (s ^ "\n");
fun getLine () = case (TextIO.inputLine TextIO.stdIn) of
SOME s => String.substring (s, 0, String.size s - 1)
| NONE => raise Fail "getLine";
›

code_reserved SML println print getLine TextIO

text‹
When the code generator sees the functions \<^const>‹println› or \<^const>‹getLine›, we tell it to use
our language-specific implementation.
›
code_printing constant println ⇀ (Haskell) "StdIO.println"
and (SML) "println"
| constant getLine ⇀ (Haskell) "StdIO.getLine"
and (SML) "getLine"

lemma "bind (println (STR ''foo''))
(λ_.  println (STR ''bar'')) =
println (STR ''foo'') ⤜ (λ_. println (STR ''bar''))"
by simp
lemma "do { _ ← println (STR ''foo'');
println (STR ''bar'')} =
println (STR ''foo'') ⪢ (println (STR ''bar''))"
by simp

subsection‹Modelling Running an \<^typ>‹'α io› Function›
text‹
Apply some function \<^term>‹iofun :: 'α io› to a specific world and return the new world
›
definition exec :: "'α io ⇒ 🌐 ⇒ 🌐" where
"exec iofun world = snd (Rep_io iofun world)"

text‹Similar, but only get the result.›
definition eval :: "'α io ⇒ 🌐 ⇒ 'α" where
"eval iofun world = fst (Rep_io iofun world)"

text‹
Essentially, \<^const>‹exec› and \<^const>‹eval› extract the payload \<^typ>‹'α› and \<^typ>‹🌐›
when executing an \<^typ>‹'α io›.
›
lemma "Abs_io (λworld. (eval iofun world, exec iofun world)) = iofun"

lemma exec_Abs_io: "exec (Abs_io f) world = snd (f world)"

lemma exec_then:
"exec (io⇩1 ⪢ io⇩2) world = exec io⇩2 (exec io⇩1 world)"
and eval_then:
"eval (io⇩1 ⪢ io⇩2) world = eval io⇩2 (exec io⇩1 world)"
by (simp_all add: exec_def eval_def bind_def Abs_io_inverse split_beta)

lemma exec_bind:
"exec (io⇩1 ⤜ io⇩2) world = exec (io⇩2 (eval io⇩1 world)) (exec io⇩1 world)"
and eval_bind:
"eval (io⇩1 ⤜ io⇩2) world = eval (io⇩2 (eval io⇩1 world)) (exec io⇩1 world)"
by(simp_all add: exec_def eval_def bind_def Abs_io_inverse split_beta)

lemma exec_return:
"exec (IO.return a) world = world"
and
"eval (IO.return a) world = a"
by (simp_all add: exec_def eval_def Abs_io_inverse return_def)

end

# Theory HelloWorld

theory HelloWorld
imports IO
begin

section‹Hello, World!›

text‹
The idea of a \<^term>‹main :: unit io› function is that, upon start of your program, you will be
handed a value of type \<^typ>‹🌐›. You can pass this world through your code and modify it.
Be careful with the \<^typ>‹🌐›, it's the only one we have.
›

text‹The main function, defined in Isabelle. It should have the right type in Haskell.›
definition main :: "unit io" where
"main ≡ do {
_ ← println (STR ''Hello World! What is your name?'');
name ← getLine;
println (STR ''Hello, '' + name + STR ''!'')
}"

section‹Generating Code›

text‹Checking that the generated code compiles.›

ML_val‹Isabelle_System.bash "echo ${ISABELLE_TMP} >${ISABELLE_TMP}/self"›
text‹
During the build of this session, the code generated in the following subsections will be
written to
›
text_raw‹\verbatiminput{${ISABELLE_TMP}/self}› subsection‹Haskell› export_code main in Haskell file "$ISABELLE_TMP/exported_hs"

text‹The generated helper module 🗏‹$ISABELLE_TMP/exported_hs/StdIO.hs› is shown below.› text_raw‹\verbatiminput{$ISABELLE_TMP/exported_hs/StdIO.hs}›

text‹The generated main file 🗏‹$ISABELLE_TMP/exported_hs/HelloWorld.hs› is shown below.› text_raw‹\verbatiminput{$ISABELLE_TMP/exported_hs/HelloWorld.hs}›

subsection‹SML›

export_code main in SML file "$ISABELLE_TMP/exported.sml" text‹The generated SML code in 🗏‹$ISABELLE_TMP/exported.sml› is shown below.›
text_raw‹\verbatiminput{$ISABELLE_TMP/exported.sml}› end # Theory HelloWorld_Proof theory HelloWorld_Proof imports HelloWorld begin section‹Correctness› subsection‹Modeling Input and Output› text‹ With the appropriate assumptions about \<^const>‹println› and \<^const>‹getLine›, we can even prove something. We summarize our model about input and output in the assumptions of a ⬚‹locale›. › locale io_stdio = ― ‹We model ▩‹STDIN› and ▩‹STDOUT› as part of the \<^typ>‹🌐›. Note that we know nothing about \<^typ>‹🌐›, we just model that we can find ▩‹STDIN› and ▩‹STDOUT› somewhere in there.› fixes stdout_of::"🌐 ⇒ string list" and stdin_of::"🌐 ⇒ string list" ― ‹Assumptions about ▩‹STDIN›: Calling \<^const>‹println› appends to the end of ▩‹STDOUT› and \<^const>‹getLine› does not change anything.› assumes stdout_of_println[simp]: "stdout_of (exec (println str) world) = stdout_of world@[String.explode str]" and stdout_of_getLine[simp]: "stdout_of (exec getLine world) = stdout_of world" ― ‹Assumptions about ▩‹STDIN›: Calling \<^const>‹println› does not change anything and \<^const>‹getLine› removes the first element from the ▩‹STDIN› stream.› and stdin_of_println[simp]: "stdin_of (exec (println str) world) = stdin_of world" and stdin_of_getLine: "stdin_of world = inp#stdin ⟹ stdin_of (exec getLine world) = stdin ∧ eval getLine world = String.implode inp" begin end subsection‹Correctness of Hello World› text‹Correctness of \<^const>‹main›: If ▩‹STDOUT› is initially empty and only \<^term>‹''corny''› will be typed into ▩‹STDIN›, then the program will output: \<^term>‹[''Hello World! What is your name?'', ''Hello, corny!'']›. › theorem (in io_stdio) assumes stdout: "stdout_of world = []" and stdin: "stdin_of world = [''corny'']" shows "stdout_of (exec main world) = [''Hello World! What is your name?'', ''Hello, corny!'']" proof - let ?world1="exec (println (STR ''Hello World! What is your name?'')) world" have stdout_world2: "literal.explode STR ''Hello World! What is your name?'' = ''Hello World! What is your name?''" by code_simp from stdin_of_getLine[where stdin="[]", OF stdin] have stdin_world2: "eval getLine ?world1 = String.implode ''corny''" by (simp add: stdin_of_getLine stdin) show ?thesis unfolding main_def apply(simp add: exec_bind) apply(simp add: stdout) apply(simp add: stdout_world2 stdin_world2) apply(simp add: plus_literal.rep_eq) apply code_simp done qed end # Theory RunningCodeFromIsabelle theory RunningCodeFromIsabelle imports HelloWorld begin section‹Running the Generated Code inside Isabelle› (*Maintainer note: We invoke the generated code ON PURPOSE from bash to demonstrate how to use the generated code from outside Isabelle and make sure the code runs.*) text‹ Usually, one would use ⬚‹export_code› to generate code. Here, we want to write the code to a temp directory and execute it right afterwards inside Isablle, so we invoke the code generator directly from Isabelle/ML. › subsection‹Haskell› ML‹ val (files, _) = Code_Target.produce_code @{context} false [@{const_name main}] "Haskell" "Main" NONE [] val target = File.tmp_path (Path.basic ("export" ^ serial_string ())) val ghc = getenv "ISABELLE_GHC"; val cmd = "cd " ^ Path.implode target ^ " && " ^ Bash.string ghc ^ " Main.hs && " ^ "( echo 'Cyber Cat 42' | ./Main )"; Isabelle_System.make_directory target; List.app (fn ([file], content) => File.write (target + Path.basic file) content) files; val exitcode = if ghc <> "" then Isabelle_System.bash cmd else (writeln "not running Haskell, because$ISABELLE_GHC is not set."; 0);

if exitcode <> 0 then
raise (Fail ("example Haskell code did not run as expected, " ^
"exit code was " ^ (Int.toString exitcode)))
else ()
›

subsection‹SML›

ML‹

val ([(_, content)], _) =
Code_Target.produce_code @{context} false [@{const_name main}] "SML" "HelloWorld" NONE []

val target = File.tmp_path (Path.basic ("export" ^ serial_string ()))
val file = target + Path.basic "main.ML"

val cmd =
"echo 'Super Goat 2000' | " ^
"\"\${POLYML_EXE?}\" --use " ^ Path.implode file ^
" --eval 'HelloWorld.main ()'";

Isabelle_System.make_directory target;
File.write file content;

val exitcode = Isabelle_System.bash cmd;

if exitcode <> 0 then
raise (Fail ("example SML code did not run as expected, " ^
"exit code was " ^ (Int.toString exitcode)))
else ()
›

end