-
Notifications
You must be signed in to change notification settings - Fork 6
Tutorial
This page is a short (and incomplete) tutorial to the basics of Husk. You are encouraged to try out the examples yourself, either by following the links to the online interpreter at Try It Online (courtesy of Dennis from PPCG), or downloading and using the offline interpreter.
How do I type the weird symbols?
Husk has an online keyboard at https://abrudz.github.io/lb/husk. To type husk code in an editor, it is recommended to use the verbose ASCII format.
Our first example is Husk's version of the famous "hello world" program.
"Hello, World!"
Try it! This program exemplifies several things about the syntax and semantics of Husk:
- String literals are defined by double quotes. If you remove the second quote, you'll notice that the program still works. In general, the terminal quote can be omitted at end of each line, and the interpreter will add it implicitly.
- A Husk program can be a single expression. If it is a string (or evaluates to a string), it is automatically printed to STDOUT at execution, without quotes.
Since Husk is a golfing language, there is a shorter way to define string literals: compressed strings.
A compressed string uses ¨ as a delimiter instead of ", and allows us to replace word fragments with non-ASCII characters.
A compressed "hello world" program looks like this:
¨H◄⁰,ω]!
Try it!
The offline interpreter comes with a tool for finding optimal compressed representations, compressString.hs.
You can also use it online.
As a second example, let's try to add two and two together. Here is the Husk program that performs this calculation.
+2 2
Try it!
You'll see that this program prints 4, as is expected.
This program is again a single expression whose value is printed to STDOUT at execution, but it is a compound expression, not a literal.
The + is a function that takes two numbers as arguments and returns their sum.
In simple expressions like this, Husk uses Polish notation: the arguments go after the function without any separator (whitespace can be used to separate two numbers, if needed).
Thus the expression 3*(2+2) is *3+2 2 in Husk; * is of course multiplication.
Now, try to remove one of the 2s (and the separating whitespace) from the "2+2" program, or add an extra number, and see what happens.
The interpreter should now give you the error "Could not infer valid type for program".
Husk is a strongly typed language, and the type of the addition function is TNum -> TNum -> TNum.
This means that it takes a number, then another number, and returns a third number.
If you try to pass it one or three numbers, the interpreter doesn't know how to run the program and gives up.
We will later see it is possible to pass only one argument to + in a way that makes sense, but for now, let's agree that it takes exactly two arguments.
The basic arithmetical functions are
-
+for addition, -
-for subtraction, -
*for multiplication, -
/for exact division, -
÷for integer division, -
%for remainder, -
^for power, -
⌉⌋for LCM and GCD, -
⌈⌊for ceiling and floor, -
yYfor maximum and minimum (which work on any two values of the same type, not just numbers), -
_for negation (there are no negative literals), -
\for reciprocal, -
±for signum, -
afor absolute value.
There are also some shortcuts for common operations: ←→ for decrement and increment, ½D for halve and double, and √□ for square root and square.
You can experiment with combining these functions into longer expressions.
Most arithmetic operations, including / and \, return rational numbers if possible, and floats if not.
As usual, floating point literals use the period ., like in 2.25.
When using these functions, you may notice one oddity: -/÷%^ take their arguments in the "wrong" order.
The expression -4 2 evaluates to -2 instead of 2.
There is a reason for this, but for now, it's just something you'll have to keep in mind.
Now, instead of having a program that always outputs 4, we'd like to make one that adds 2 to a given number.
In Husk, inputs are taken as command line arguments to the interpreter.
The most basic way of accessing the inputs explicitly is via the even superscript digits ⁰²⁴⁶⁸.
The easiest way to turn an expression into a program is to replace one or more literals with ⁰ and add one command-line argument, whose value is passed to the ⁰s.
+2⁰
Try it! If you want to have more than one input, use the next even superscript. This program adds together two numbers.
+⁰²
Try it!
Somewhat counter-intuitively, ⁰ refers to the last command-line argument that a superscript is used for, ² to the second-to-last, and so on, up to the number of inputs that you use the superscripts with.
If you skip an even superscript, for example by using ² but not ⁰, the corresponding argument is ignored.
The odd-numbered superscripts ¹³⁵⁷⁹ also refer to command-line arguments, but the semantics are slightly different.
If you use ¹ instead of ⁰, then an extra copy of the argument is implicitly appended to the program when it is parsed.
For example, the programs +⁰⁰ and +¹ are equivalent, as the second one is transformed into +¹¹.
In general, superscripts 2k and 2k+1 can be used to refer to the same command-line argument, and the latter causes a copy of it to be inserted at the end of the line; we refer to them as a block of two variables.
If both variables of a block are used in the program, the semantics of the even-numbered variable changes: ⁰F becomes shorthand for `F¹, where `F is F with flipped arguments.
If neither of them are used, but variables of some higher block are, then the corresponding argument is ignored.
As a moderately complex example, consider the program
+⁷*¹/⁰-
The highest two-variable block used in this program is 6,7, so we consider the four blocks 0,1, 2,3, 4,5, 6,7.
This means that the program takes four inputs; call them x01, x23, x45 and x67.
Both variables ⁰ and ¹ of the first block are present, so ⁰- becomes `-¹.
Next, copies of the odd-numbered variables are inserted to the end of the line, which happens in reverse numerical order:
+⁷*¹/`-¹⁷¹
Since the arguments are also taken in reverse numerical order, this program computes the 4-argument function x67, x45, x23, x01 -> ((x01 / (x01 - x67)) * x01) + x67.
You may think that superscript digits are a clunky way to take input, and you wouldn't be wrong. Fortunately, Husk also supports implicit inputs. Take as an example this program.
+2
Try it!
We tried to run this program before, but got a type error.
If we add a command-line argument, it now adds 2 to it and prints the result.
You can think of this as the interpreter implicitly inserting the input at the end of the program before evaluating it.
Another way of seeing this is that besides a single expression, a Husk program can be a single function.
On execution, it is called on the input values, and the result is implicitly printed.
The expression +2 is indeed a function: it is the binary function + partially applied to the value 2, which makes it a one-argument function that takes a number n and returns 2 + n.
Partial application is a major part of Husk programming, and thus it happens implicitly.
Partially applied functions can be chained perform more than one operation to an input. As with expressions, the evaluation happend from right to left. This program multiplies its input by 3 and then adds 1.
+1*3
You can take more than one implicit input, and chain the results. This program computes the average (halved sum) of two numbers.
/2+
Try it! Partial application is the main reason for the strange argument order of some functions: dividing an unknown value by a constant is a much more common operation than dividing a constant by an unknown value.
It is also possible to chain binary functions that are not partially applied.
In general, if you chain two functions f and g, it means that the result value of g is used as the first unused argument of f, and the other arguments are taken afterwards.
This program takes three numbers p q r, and computes the value abs(p+q)*r.
*a+
Try it!
The first two inputs are given to +, the rightmost function.
The result of that is given to a, which computes the absolute value.
The result of that is passed as the first argument of *, and the second argument is the third input.
This is consistent with the interpretation that p q r are tacked to the end of the program and the resulting expression is evaluated.
It is also possible to combine explicit and implicit inputs.
Remember that if you use an explicit even-numbered superscript to refer to an input value, then it is not added as an implicit argument for the program.
For example, this program takes four arguments p q r s and computes (r+q)*s - p.
-²*+⁰
Try it!
It works by substituting p and q to the superscripts ² and ⁰ (remember the reverse order or superscripts), and after that adding r and s as implicit arguments.
Besides numbers and strings, Husk also has lists.
In fact, a string is just a list of characters.
The syntax for a list as a program argument is [1,2,3], so square brackets and commas.
List literals cannot be used in source code, however.
Here are some basic functions for working with lists:
-
:tacks a value to the beginning or end of a list, -
+concatenates two lists, -
;wraps a value into a singleton list, -
←→extract the first and last element of a list, -
thdrop the first and last element of a list, -
↑↓take and drop a given number of elements from a list, -
!gives the element at a specified (1-based modular) index, -
€gives the index of an element in a list, or0if it's not an element, -
↔reverses a list, -
Lgives the length of a list, -
ḣŀṡṫgive different kinds of ranges for a single number, -
…gives the range from one number to another.
For example, this program takes a list of numbers x, a number n, and another number k, and finds the index of k in the first n elements of x.
€↑
Try it! Try to analyze how implicit arguments and chaining work in this example.
At this point you should note that some of these functions were already used for arithmetic, like +.
Such functions are overloaded, which means they have several different behaviors that depend on the types of their arguments.
Given two numbers, + adds them together, but given two lists, it concatenates them.
What's not important at this point, but will become very important later on, is that the correct behavior of each function depends on the entire program and the inputs: the interpreter chooses the behavior of the functions so that the program makes sense as a whole, and only then executes it.
Returning to lists, it's important to note that Husk lists are homogeneous, which means that they can't contain values of different types.
You can't insert a character into a list of numbers, or a list of numbers into a list of lists of strings.
Even an empty list always has a specific type; the type of a list that contains elements of some type a is denoted [a].
This is how functions like : know how to behave when given two lists as inputs.
If the left argument is a list that could contain the right argument, it's tacked to the end; if the right argument is a list that could contain the left argument, it's tacked to the beginning.
The functions ↑↓!€ can take their arguments in either order: since one is a list and the other is a number or list element, the correct behavior is always clear.
These basic functions and others like them can only get you so far. Eventually you'll want to build programs with flow control: conditional statements, loops and such. In Husk and many other functional programming languages, this is a bit different than in imperative languages, as flow control is achieved with higher order functions. These are functions that take other functions as arguments and/or give functions as return values.
Let's take map, which is assigned to the symbol m, as an example.
It's a function of type (a -> b) -> [a] -> [b], which means that it takes as arguments a function of type a -> b (which takes an a and returns a b) and a list containing elements of type a, and returns a list of elements of type b.
Its behavior is to apply the function to each element of the input list, and return the list of results.
This program multiplies every element in a list by four.
m*4
Try it!
The *4 is a partial application of * to 4, so a function that multiplies by four, and m takes this function as its first argument.
Why is *4 not applied directly to the input?
Because this would not result in a well-typed program: m expects a function as its first argument and a list as its second argument, and the only way to satisfy this is to treat *4 as function application, and the whole program as the application of m to it.
What if you want to multiply by 4 and then add 3 to each list element?
The program m+3*4 does not work, since m tries to grab +3 as its first argument.
You can skip the following more detailed explanation on first reading.
Internally, since Husk uses Polish notation, programs are parsed from left to right: m+3*4 is equivalent to (((m+)3)*)4, and each pair of two tokens ab is treated as either an application of function a to value b, or the generalized composition of the functions a and b.
The word "chaining" that we used above means exactly generalized composition.
For the program m*4, the interpreter deduces the form (m {COMPOSE} *) {APPLY} 4.
When the composition m* is applied to the number 4, the function * takes 4 as its first argument and becomes the partially applied function *4, and this function is given to m as its first argument.
For m+3*4 the interpreter cannot deduce a form that makes sense.
To fix the error, we use another higher order function, o, which is explicit composition of two functions.
mo+3*4
Try it!
In this program, the arguments of o are the partially applied functions +3 and *4.
They are composed, or chained, together into a single function, which is then given to m.
The built-ins ȯ and ö work like o, but they compose three and four functions instead of two.
If you want to compose more than four functions (or just prefer more readable programs), you can use parentheses () for grouping expressions.
The above program is equivalent to m(+3*4).
Like with the double quotes, a trailing parenthesis ) can be omitted at the end of a line.
Husk has several higher order functions that loop over lists in one way or another.
We already saw m, which maps a given function over the elements of a list.
A variant of that is ṁ, which sums or concatenates the list after mapping a function over it, depending on whether it is a list of numbers or lists.
For example, ṁa gives the sum of the absolute values of a list (Try it!).
If you have two lists and want to apply a binary function on them element-by-element, use zip, which is assigned to z.
For example, z+ takes two lists and adds their corresponding elements together.
Try it!
If one of the lists is shorter than the other, the extraneous elements are simply discarded.
There is also a version of zip, assigned to ż, that keeps the extraneous elements intact, but you can only use it if the two lists and the result all have the same type.
Sometimes you'll also want to add a single number to all elements of a list.
You can of course use m for that, but sometimes it's shorter to use M or Ṁ.
The first of these, M, takes a binary function, a list and a value, and applies the function to each list element and the value.
Try it!
The function Ṁ works similarly, but takes the value before the list.
What if you want to apply a binary function to all possible pairs of elements drawn from two lists?
For that, Husk provides table (Ṫ) and mix (×), both of which take a binary function and two lists.
table constructs a "multiplication table" by the function (Try it!), whereas mix just enumerates the results in a single flat list (Try it!).
The function filter (f) walks through the list and deletes its elements according to a condition.
It takes a function and a list, and deletes those elements for which the function returns a falsy value.
At this point, I should mention that some Husk values have a property called truthiness: they can be either truthy or falsy.
The most common cases are numbers, characters and lists.
The only falsy number is 0; all other numbers are truthy.
As a convention, most Husk functions that are designed to test a condition use nonnegative integers as return values.
A character is falsy if it's a whitespace character.
This includes spaces, tabs and such.
A list is falsy precisely when it is empty.
Here are some examples of condition functions:
-
=≠test for equality and inequality. -
<>≤≥are ordering conditions. Note that their arguments are flipped:<2 3returns0. -
€returns the 1-based index of a value in a list, as we saw before. It can also be used to test whether a value occurs in a list. -
¦takes two numbers and tests whether the first divides the second. -
Etests whether all elements of a list are equal. -
εtests whether a number has absolute value at most 1, or whether a list contains exactly one element. -
I, the identity function, can be used as a truthiness test. -
¬tests whether a value is falsy.
For example, the program f¦3 takes a list of numbers and keeps those that are multiples of 3.
Try it!
The function # works like f, but counts the number of truthy results instead of collecting them into a list.
If # is given a value and a list, it counts the number of occurrences of that value.
Finally, Husk has several versions of the fold/reduce function, which intuitively combines the elements of a list into a single value.
foldl (F) and foldr (Ḟ) reduce a list from the left and right, respectively.
Their arguments are a binary function, an initial value and a list.
For example, F-0 applied to [1,2,4] gives 3 (which is 4-(2-(1-0))), while Ḟ-0 applied to it gives -7 (which is ((0-4)-2)-1); note that Husk's - takes its arguments in the reversed order.
The variants foldl1 and foldr1, which are bound to the same characters, work without the initial values.
Cumulative reduce, or "scan" in Haskell terminology, is achieved with the functions G and Ġ, which correspond to F and Ḟ but collect all intermediate results into a list.
They can also be used with or without initial values.
There are a few built-in folds and scans for common special cases:
-
sum(Σ) adds together a list of numbers. -
concat(Σ) concatenates a list of lists. -
prod(Π) gives the product of a list of numbers. -
cartes(Π) is the Cartesian product of a list of lists. -
cumsum(∫) is cumulative sum of a list of numbers. -
cumcat(∫) is cumulative concatenation of a list of lists. -
maxlandminl(▲▼) give maximum and minimum elements of a list.
Mapping a simple function like *4 over a list is easy: just call m on it like we did a couple of sections back.
But what if you have a more complex function, one that needs several superscripts when defined as a program?
Here you have three options: line functions, which are like separate definitions of named functions in more conventional programming languages; lambda functions, which are their anonymous inline versions; and combinators, which are higher order functions that glue small functions together into bigger ones.
Combinators are the most efficient of the three in terms of code size, but require a bit more mental gymnastics and are not quite as flexible.
In this section we focus on the first two.
Let's start with line functions.
A Husk program can consist of more than one line of code, each of which defines a function.
The first line is the main function, and the rest are auxiliary.
When the program is executed, the main function is called on the program's arguments and the result is printed.
The auxiliary functions (and the main function) can be explicitly called using the subscript digits ₀-₉.
The indexing starts from 0, so that ₀ refers to the main function.
Every line follows the same rules about the superscript arguments that we explained for the main function: ⁰¹ refer to the last argument, which is also implicitly inserted after the line unless you use ⁰ but not ¹, and so on.
Let's take an example.
Suppose we have defined a function that takes a number n and returns n*(n+5).
As a full Husk program it could look like this:
*¹+5
We take the implicit argument on the right, add 5, and then multiply it by the argument, which is bound to ¹.
Now we'd like to map this function over a list, so let's demote it to an auxiliary line and do the mapping on the main line:
m₁
*¹+5
Auxiliary line functions are especially handy when you want to apply the same function in several different places.
For example, suppose you have a list of lists of numbers, like [[1,2,3],[3,2],[5,0,3,2]] and you want to take the average of each list and then the average of those.
There is a built-in function A that computes the average of a list, but we'll use an auxiliary definition for the example's sake:
₁m₁
/L¹Σ
Try it!
The average of a list is its sum (Σ) divided (/) by its length (L), which the second line implements.
On the first line, we map this function over the list of lists and then call it again on the result.
The second method of defining complex functions is with lambdas, or anonymous inline functions.
Basically, whenever you have a line function with one argument that you refer to using ⁰ and/or ¹, and you call that function from only one place, you can wrap it in λ) and move it to the call site.
For example, the second program of this section is equivalent to this:
mλ*¹+5)
What happens is that λ...) denotes an unnamed function that takes one argument and returns the value of the expression ..., inside which ⁰¹ can be used to refer to the argument.
If you don't use either superscript, then the argument is ignored and ... is returned as-is.
In particular, the argument is not implicitly inserted after the expression unless you use ¹ in it.
Lambdas and line functions play similar roles, but each has their own pros and cons when it comes to convenience and source code size.
First (and most obviously), line functions can be used more than once, while lambdas are just inline expressions.
Second, a lambda can be shorter than an inline function, because the trailing ) can be omitted at end of line.
In fact, our program above could be mλ*¹+5.
Third, you can directly refer to the arguments of the surrounding line function, and of any enclosing lambdas, inside a lambda environment.
Let's elaborate on the last point.
Suppose we take a list of numbers, and want to multiply each element with the length of the list and then add the number of occurrences of that element.
For example, [1,2,3,3] would become [5,9,14,14]; each 3 becomes 3*4+2=14 since the list has four elements, two of which equal 3.
We can implement this function as follows:
mλ+#¹³*L³
Try it!
What happens here is that inside a lambda environment, every existing superscript gets "bumped up" by 2 to make room for ⁰¹, which now refer to the lambda argument.
Hence the ⁰¹ that referred to the list become ²³ inside the lambda.
The familiar rules still hold: since we used the odd superscript ³, the list is taken as an implicit argument of the entire program, and similarly for the lambda argument.
There are shorthand commands μ and ξ for doubly and triply nested lambdas (which are in practice lambdas that take two or three arguments instead of one).
The code μ...) is equivalent to λλ...)) and ξ...) to λλλ...))); the closing parentheses can again be omitted at end of line.
For example, μ*+¹³-) computes the two-argument function that takes x and y and returns (y+x)*(y-x) (remember the argument order of -).
Finally, it's good to remember that both lambdas and line functions can take additional arguments, which are implicitly inserted on their right.
If you know Haskell, this is not surprising: a lambda may return a function, which can then be applied to a value.
For example, λ+¹*) takes two numbers, x and y, and returns y*x + x.
The superscript ¹ refers to x.
Now that you know how to use line functions and lambdas, it's time to learn about the feature that lets you bypass them: combinators. A combinator is a function that takes one or more functions as arguments and combines them into a new function in a generic way, which means that it doesn't care exactly what kind of data the functions process. Here is a list of the combinators available in Husk, roughly organized by complexity of common use cases.
I found a function I could use, but it takes its arguments in the wrong order.
Then use `, which flips the first two arguments of a function.
For example, maybe you want to subtract something from 10 rather than subtracting 10.
This is done with `-10.
One neat application that sometimes comes up is to flip the arguments of a higher order function, like f when you have a fixed list that you'd like to filter by a variable condition.
For example, `fḣ10¦ filters the numbers in the range [1,2,..,10] based on whether they are integer multiples of the argument.
I have a two-argument function, and I want to use the same value as both arguments.
Then use ´.
For example, maybe you want n repetitions of a number n, like [2,2] for 2 and [4,4,4,4] for 4.
This is done with ´R.
I want to apply several functions one after another.
Then use oȯö, which compose two, three and four functions respectively (for five or more, use parentheses or a line function).
Basically, oAB is equivalent to (AB), ȯABC is equivalent to (ABC) and öABCD is equivalent to (ABCD).
They are also equivalent to just putting the functions on a separate line and calling the line function (assuming they contain no superscripts).
This works even if some of the functions take two or more arguments: ȯ*±- gives the sign of the difference of two numbers and multiplies it by a third number.
We have already briefly looked at these combinators in an earlier section, and they're very common in golfed Husk.
The combinator · is a variant of o for when you want to modify the second argument of a binary function.
If A is binary, then oAB takes two values x and y, and applies A to Bx and y.
Sometimes you want to apply it to x and By instead, and this can be done with ·AB.
I want to "reuse" a value: modify it and combine with the original using some binary function.
Then use S or Ṡ.
SAB is equivalent to the line function A¹B, which applies A to the argument x and Bx, and ṠAB is equivalent to AB¹.
B is often a composition of several functions.
As a simple example that we have already used before, mS*+5 maps the function n -> n*(n+5) over a list.
I want to do two different things to a value and combine the results.
Then use §.
§ABC is equivalent to AB¹C: it applies B and C to its argument and feeds both results to A.
For example, §/LΣ computes the average of a list.
This combinator has a second mode too: if B and C are two-argument functions, then §ABC takes two arguments as well, feeds both to B and C and the results to A.
For example, §e+* computes [x+y,x*y] from x and y.
I want to do the same thing to two values and combine the results.
Then use ¤.
¤AB takes two values, applies B to both and feeds the results to A.
For example, ¤=Σ checks if two lists sum to the same value, and ¤-□ computes y^2-x^2 from x and y.
I want to do different things to two values and combine the results.
Then use ~.
~ABC takes two values x and y, and applies A to Bx and Cy.
It seems rare in practice that ~ is the only shortest way to implement something, but sometimes it is.