This week I got a 2014 Dell Chromebook 11 as a secondary laptop for carrying around to classes. It’s now discontinued, but it fills that purpose perfectly. However, there are some things to be aware of when installing Arch Linux on it.
The steps are nearly the same as what the Arch wiki page recommends, though some of the info there is outdated.
Afterwards, set up Arch like normal. Some tips:
ehci_pci in GRUB: https://wiki.archlinux.org/index.php/Chrome_OS_devices#With_kernel_parameters

That’s pretty much all I’ve run into so far. Other quick notes:
lm_sensors and fancontrol don’t detect them.

xbacklight can’t adjust the backlight for some reason, but you can do it manually with /sys/class/backlight. There is some kernel parameter that needs to be set.

Overall, for the $290 (tax+shipping) I paid, I’m happy with what I got—a light, long-lasting portable laptop that is still reasonably powerful (ULV Haswell i3, 4GB RAM) for programming tasks. The only task remaining is to find a lightweight web browser.
So over the summer I've been interning at RetailMeNot in Austin, TX, working on a (hopefully) soon-to-be open-sourced project written primarily in Go, Google’s pet language. In sum, I find Go to be a pragmatic language: one designed for getting things done quickly, but not a radical language that pushes the envelope of language design.
Quick things: Go is a statically typed language with a pared-down, C-like syntax. It depends on a runtime and is garbage collected, and though it syntactically has features reminiscent of object-oriented languages, it lacks inheritance, and polymorphism is done purely through interfaces. The standard hello world:
package main
import "fmt"
func main() {
    fmt.Println("Hello, world!")
}
Instead of objects, we have structs, and can define functions that take one as a special receiver parameter.
type Vec2 struct {
    x int
    y int
}
func (v Vec2) Add(other Vec2) Vec2 {
    return Vec2{
        x: v.x + other.x,
        y: v.y + other.y,
    }
}
Instead of polymorphism and inheritance, we can define interfaces. Interfaces in Go are structurally typed: if a struct has all the necessary methods, it automatically satisfies the interface.
type Vector interface {
    Mag2() float64
}

func (v Vec2) Mag2() float64 {
    return float64(v.x*v.x + v.y*v.y)
}
// Vec2 now implements Vector
Types are declared differently from C, with the type following the variable. The language does technically have semicolons, but the compiler inserts them for you. There aren't private/public declarations like in C++ and Java—instead, a name is exported if and only if it begins with a capital letter. There are no variant types/ADTs and no enumerations.
The community and tooling matter as much as the language design, and Go has a strict set of style guidelines enforced by a tool called gofmt. So you won’t be debating tabs or spaces (tabs), and you won’t wonder how to indent a block (gofmt does that for you). This makes learning the language simpler, as gofmt (along with golint and govet) helps enforce idioms and style.
Easy Concurrency: Goroutines make it easy to run something in the background and communicate with that task. Channels make it easy to synchronize, communicate, and share data without having shared mutable state. The built-in race checker tells you when you’re doing something dangerous. How it works:
go func() runs func on a goroutine. It won’t block your code.

chan T is a channel of type T. make(chan T) will create such a channel.

If c is a channel of type T, then c <- t will put the value t onto the channel (blocking if there is no space on the channel) and v := <- c will try to read a value from the channel and put it in v (blocking if there is no value). The blocking helps you synchronize different goroutines. There’s no need for locks or other synchronization primitives to use these, though they are available in package sync.

Goroutines don’t necessarily correspond to threads; as the name implies, they’re more akin to coroutines. In Go 1.4 and below, by default all goroutines share one thread.
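These pieces compose naturally. As a minimal sketch (the sum helper and its inputs here are my own, not from any particular codebase):

```go
package main

import "fmt"

// sum sends the total of nums on channel c.
func sum(nums []int, c chan int) {
	total := 0
	for _, n := range nums {
		total += n
	}
	c <- total // blocks until a receiver is ready
}

func main() {
	c := make(chan int)
	go sum([]int{1, 2, 3}, c) // runs concurrently; main is not blocked
	go sum([]int{4, 5, 6}, c)
	a, b := <-c, <-c   // each receive blocks until a goroutine sends
	fmt.Println(a + b) // 21, regardless of which goroutine finishes first
}
```

Note that no locks appear anywhere: the channel both transfers the data and synchronizes the goroutines.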
Consistency: gofmt, golint, and govet mean that all code is consistent. The three tools warn about style guide violations and remind you to document your code (in accordance with the style guide).
Easy Deployment: The main Go compilers produce statically linked binaries. Cross-compilation is as easy as setting a few environment variables during the build.
Resource Management: defer makes cleanup simple by registering a function to be executed when the current function returns. Open a file? Just defer file.Close(). No need for try-catch-finally.
No ADTs: Pattern matching and ADTs, or even just enumerations, would be extremely useful, just to be able to express the concept of having one of a finite set of mutually exclusive values.
Error Handling: Error handling is done C-style: check the return value. Multiple return values make this a bit less crufty, but the sad fact is that half of your code is gonna look like this:
thingIWant, err := doSomething()
if err != nil {
return err
}
defer thingIWant.Close()
This gets extremely tedious, and leads into…
Type System: The pattern above could easily be abstracted by something like an option or Either monad (OCaml, Haskell), or with a try! macro like in Rust. But it isn’t. You’re stuck writing all this repetitive code. Go is halfway there, but lacks the abstraction to make error handling less tedious.
Default values: In Go, if you declare a variable without an initial value, it takes on a default zero value. This is true for any type, including structs. This can be convenient, but I think the convenience is outweighed by the fact that sometimes, you want to know when a value actually was omitted. You can use pointers and nil for this, but then you have to deal with nil. The default values are also somewhat inconsistent: the zero value for a map (hashtable), for instance, is nil, which isn’t fully usable and will cause a runtime panic if you try to insert into it—which rather defeats the purpose (I believe) of having default values in the first place (so you don’t have to initialize everything).
Go makes writing servers and other code that needs to be concurrent easier, and the strict tooling and familiar syntax makes learning the language easy. The design seems oriented towards speed of development and ease of deployment. Overall, I’ve enjoyed working with Go, but it’s not a language that I’ll continue to work with, like Python or Rust. It does fit the niche of server and daemon applications quite well.
For someone who already has several years’ experience with programming, through open-source, side-projects, and commercial work, what can they gain from a university computer science course?
Initially I doubted Cornell’s CS2112 course, taught by Andrew Myers. The curriculum sounded like a standard Java class, going through various aspects of the language before touching upon data structures, GUI toolkits, and HTTP servers. And superficially, that was the course. Where CS2112 succeeded was in taking those concepts and challenging us with them—sure, we might think we know how a hashtable works, but can we actually implement one? We might’ve used Git for our side projects before, but can we actually work in pairs to write software?
And Myers emphasized software engineering as well, requiring us to design and specify our software before starting on the work. Because our last four assignments were cumulative, those who glossed over design initially were given a quite intimate lesson on what technical debt actually entails.
Overall, the class wasn’t about specific knowledge—it wasn’t about graphs, or data structures, or generics—it was about how to design, write, and test software. How to be a capital-S Software Developer, and how to reason about complex design and implementation problems in the context of a larger project, instead of in isolation as a cute “programming challenge” like so many of us focus on today. Sure, this isn’t the flashy algorithmic wizardry, obsessive performance optimization, or mystic code golfing that so many of us associate with skill in our code jams and competitions. It’s stolid, maybe even a little boring—but it’s what makes programming a craft, in addition to an art and (someday) a science.
For any future Cornell students debating between 2110 and 2112, I highly recommend the challenge of 2112. Just don’t procrastinate.
For Cyber Monday I purchased C for Programmers with an Introduction to C11, by Paul and Harvey Deitel. My overall impressions of this book so far are rather lukewarm, but writing a book “for programmers” is difficult because it’s hard to define what a “programmer” is. I think K&R C would’ve been closer to what I wanted, but it’s an old—though most likely still quite relevant—book by now.
The initial chapters that I’ve skimmed through have focused on how to run and compile C programs, what a terminal is, what Linux is, what an assignment operator does…all things that I think a programmer should know, and nothing that is really about C itself. Perhaps for someone used to Visual Studio their entire career, or a college student still relatively new to programming, such topics would be useful. But for me, lost between all that noise were the important subjects, such as how C compiles and links code separately. As a “programmer”, what I don’t want is a tedious three page example of running a guess-the-number game. A book marketed as “for programmers” should focus on where C differs from common procedural languages; what I want is C for Python and Java Programmers. Such a book would focus on pointers, memory management, headers, security considerations, bit manipulation, and other areas that C and its ilk are known for, instead of devoting pages to operator precedence (which I’m never going to use because if it’s not clear, I would add parentheses).
This probably isn’t an entirely fair assessment. As a Linux user, I’m already fairly comfortable with the command line and a lot of the other background material (like what Linux is) that perhaps most programmers would like to learn about. But I also think most of the tedious preparatory material—setting up compilers, running programs, and so on—should be farmed out to other books or the Internet. A programming language book should focus on the language. This would work better for languages like Python with a single “standard” runtime, where there is a clear central location where a user can find help and documentation on setting up a development environment.
The book has its strong suits: the end of each chapter discusses secure programming and specific problems to avoid in C. I would appreciate more of this material, but at least it’s there.
Ultimately, this book’s problem is that its audience—“for programmers”—is ill-defined. Different programmers are going to be comfortable with different things and will need to review different subjects. But I feel the first third of the book could have been far more concise.
Out of five stars, this book would get three-and-a-half. My advice? Skim the first third of the book and focus on the topics that make C different. This book could easily have been much more concise or focused much more on topics and techniques unique to C, perhaps merging aspects of an introductory C book and an intermediate Effective C++ style book—that’s what I would ideally want. It’ll get the job done (and I’ll post a full review in a few months’ time), but it could be better.
I’ve updated the design of this blog, switching from Jekyll to Hakyll in the process.
I also took the chance to use a free domain from Namecheap, http://lidavidm.me, and restructure this blog to be hosted under /blog instead of at the root of the site.
This design is inspired by Google’s new Material Design, and in particular the Paper Elements demo. However, it’s not as elaborate as Google’s demos, nor is it animated—I’ve simply taken the superficial look.
Personally I find that Material Design has too much whitespace—Google’s designers know better than I do, but the neon colors and excessive padding in the design guidelines bothered me in general. I have also opted for the Crimson and Open Sans fonts over Google’s Roboto.
Admittedly, the animations in Material Design are gorgeous—the ink ripple effect is a nice touch, and the way the pages smoothly morph into each other makes the UI feel responsive. I have not replicated them because they’re overkill for this simple blog, but using Polymer to develop a web app with these styles is definitely something I will be trying.
I find the use of shadows, paper, and other skeuomorphic effects in Material Design interesting as well—the current trend has been towards completely flat interfaces, which started with Microsoft’s Zune and continued with Windows Phone 7, Metro, Google’s Holo, and Apple’s iOS 7. Material reverses the trend away from skeuomorphism while still retaining the “flat” design that is so popular.
My redesign here is still ongoing—I’d like to differentiate myself from Material Design further and create my own look (though this design will clearly be inspired by Material).
GitHub Pages has Jekyll built in, making it quite convenient to use. However, depending on GitHub to run Jekyll on my repository was frustrating; it took a while for the site to be rebuilt, and I had to wait until then for error messages. There also isn’t a way to disable processing for certain pages—everything gets run through Jekyll, whether I like it or not. Running Jekyll locally would have solved these issues, of course. Instead of Jekyll I decided to use Hakyll, a Haskell site generator.
Hakyll specifies configuration in Haskell code; instead of running “the” Hakyll processor, your configuration is compiled into a site generator, which you then run to output the actual site.
I’ve been quite busy these past couple months and haven’t had a chance to really work on SymPy or write posts. I promise more is coming soon, about a TypeScript/Canvas project that I’ve been working on for a few months now.
2015-02-01: New About page.
2014-09-12: Added back Disqus commenting, updated home page, tweaked styling, merged about/contact, added RSS.
Say you’re an aspiring writer. You’ve toiled endless hours working on your book, telling the tale of a lieutenant and his betrayal of his general, picking each word carefully, giving your sentences that twist, perfectly capturing the image of the spring vista outside your window, the backdrop of your epic. You publish your book and immediately make it onto the New York Times bestseller list. Time to sit back and relax…
One day, you open your mail and see a legal notice [1]. Another writer, someone named William Shakespeare, says he has a patent on stories about lieutenants who betray their masters. That’s ridiculous, you say. Plots are ideas and that Shakespeare can’t patent them! [2]
What are your options?
Well, you remember reading a story similar to Shakespeare’s, published long before the Bard received his patent. Perhaps you could challenge him in court and invalidate the patent? How much would that cost?
$500,000—in legal fees [3]. That’s not counting the settlement you’d have to pay if you lose.
Pick yourself up off that floor. There’s more to that legal notice.
But, you whine, how am I supposed to know?
Good question. Here’s another one: what’s the difference between a patent on “a means of communicating a story in which an official betrays his superior” and a patent on “a method of communicating a tale about the betrayal of a general by a subordinate”? How are you supposed to search for patents that could cover your story, given that you could word and reword and reword the same idea multiple ways? Or that you could have multiple slightly different ideas that still cover your story?
And let’s consider what Shakespeare could demand from you. He could ask that you license his patent—in effect, that you pay him money to use his “invention”. He could ask for an injunction—that you censor your book and not publish or sell it. Or you could refuse, and rewrite your book to avoid his patent.
This is still ridiculous. You can’t actually patent stories! And indeed you can’t. But this is just an allegory—now consider this real-life example.
Now, you are Fred Chang, CEO of Newegg.com, a successful online electronics retailer. And how do people buy from your store? They put products in a shopping cart, of course. Just like every other site, just like they do in the physical world.
And one day you (i.e., some drone in your legal office) receive a legal notice from Soverain Software. They claim to own the idea of a shopping cart. Of course, they refer to it as a “network sales system”, and want their rightful share of your profits. Amazon has already agreed to pay $40 million. Soverain wants $34 million from you. Every other company targeted has agreed to settle. What would you do in Newegg’s shoes?
In the case of Newegg, they decided to fight. Newegg lost the first trial. Indeed, the judge didn’t even let the jury hear their argument that the patent was invalid! Luckily, on appeal, the judgment was reversed and the patent declared invalid. A good ending for Newegg, and an example of justice: a predatory company that makes no products, a patent troll, has its patent invalidated.
But in the real world, not every story has a happy ending. And not every aggressor is an obvious troll. Let’s look at Samsung vs Apple, the two titans on the sides of the Android phone vs iPhone battle.
In 2011, Apple filed suit against Samsung, claiming that Samsung’s products infringed upon Apple patents for such features as double-tap-to-zoom and pinch-to-zoom, seeking $2.75 billion in damages [4]. Apple was originally awarded $1.05 billion, still a staggering sum, and of course Samsung has appealed, the award has been changed, etc. As of this writing, the retrial is still dragging on. Samsung could easily see its products banned from the US market, leaving consumers with less competition in the smartphone market. And what products remain could face price increases, as manufacturers must then license Apple’s patents or face litigation. In fact, HTC has already agreed to license from Apple, under unknown but likely onerous terms.
What is the purpose of patents? To protect inventions from copycats. Is an online shopping cart an invention? (Yes, said Soverain.) Is slide-to-unlock (warning: Internet meme) an invention? (Yes, says Apple.) Is a little tray that holds all your notifications an invention? (Yes, says Google.)
But are these inventions? Or are they ideas?
This is the world of software patents. Lawsuits are only increasing, the number of software patents is steadily climbing, and the damages are now in the billions. Is there a solution? Any number of academics, policymakers, and others have proposed reforms, laws, bans. But ultimately, it remains up to us—the consumers—to make this problem a priority, not just an abstruse debate.
For more information, check out the End Software Patents wiki.
[1] This analogy comes by way of Richard Stallman, by the way.

[2] See the USPTO’s website: “A patent cannot be obtained upon a mere idea or suggestion.”

[3] “The Private Costs of Patent Litigation” (2008), page 16.

[4] See, for instance, the Huffington Post story.
SymPy Gamma is essentially SymPy’s clone of Wolfram|Alpha, at least in terms of math features. We just rolled out some new features:
Improved plotting, still based on D3.js
Multiple graphs on one plot:
Polar and parametric equations:
Diophantine equation solving: diophantine(3*x**2 + 4*y**2 - 5*z**2 + 4*x*y - 7*y*z + 7*z*x)
Recurrence relation solving: rsolve(y(n+2)-y(n+1)-y(n), y(n), {y(0): 0, y(1): 1})
An updated version of SymPy, and various other improvements
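The rsolve example above runs directly in SymPy as well; this snippet (assuming a reasonably recent SymPy) solves the Fibonacci recurrence:

```python
from sympy import Function, rsolve, symbols

n = symbols("n", integer=True)
y = Function("y")

# y(n+2) - y(n+1) - y(n) = 0 with y(0)=0, y(1)=1: the Fibonacci numbers.
# rsolve returns a closed form (Binet's formula, in terms of sqrt(5)).
fib = rsolve(y(n + 2) - y(n + 1) - y(n), y(n), {y(0): 0, y(1): 1})
print(fib.subs(n, 10).evalf())  # numerically 55, the 10th Fibonacci number
```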
Gamma's (and by extension, SymPy's) parsing still needs work, though, when it comes to the implicit style. Consider, for instance, the input y(x). Is this the product \(y \cdot x\), or an undefined function \(\mathrm{y}\) applied to \(x\)?

This makes inputs ambiguous, at least to the parser. For instance, it makes sense that expand(a(x + 1) + b(x + 2)) should be \(ax + bx + a + b\) and not \(\mathrm{a}(x+1) + \mathrm{b}(x+2)\). Meanwhile, rsolve(y(n+2) - y(n+1) - y(n), y(n)) wouldn't make any sense as \(\mathrm{rsolve}(ny + 2y - ny - y - ny, ny)\).
How do we solve this problem? I see three, non-mutually-exclusive ways. One of them is to require explicit function syntax, like Function("f")(x), though this is rather tedious to type.

So in the future, I'd like to improve SymPy's parser more -- and hopefully, someday add natural language parsing to Gamma. Dealing with these ambiguities is a first step towards that.
This is a work-in-progress and will be periodically revised to reflect changes in SymPy. Last updated: 2014-06-21
One of the annoyances of entering mathematics on the computer is the rigidity of the format the computer generally expects. In SymPy, sympify() won’t accept any of the following, though a human would:
2x
5 sin x
6(9)
sin x^3 + y
ln sin x
sin^2 x
xyz
SymPy’s implicit parsing aims to fix this. Implicit multiplication takes care of statements like 2x, symbol splitting allows for xyz, implicit application enables sin x, and function exponentiation allows sin^2 x.
But first, we have a problem: some of these statements are ambiguous. For instance, should sin x^3 + y be interpreted as sin(x^3) + y, sin(x^3 + y), or sin(x)^3 + y? The last one, to a human, wouldn't make much sense, but the other two are perfectly valid. For SymPy, we’ve arbitrarily chosen the first interpretation.
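A quick check of that choice, using the implicit transformations described later in this post (with ** instead of ^, since ^ needs its own transformation):

```python
from sympy import Symbol, sin
from sympy.parsing.sympy_parser import (
    parse_expr,
    standard_transformations,
    implicit_multiplication_application,
)

transformations = standard_transformations + (implicit_multiplication_application,)
expr = parse_expr("sin x**3 + y", transformations=transformations)
# sin applies to the whole power, per the chosen interpretation:
assert expr == sin(Symbol("x") ** 3) + Symbol("y")
```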
That’s not all. Let’s look at the last case more: xyz. To a human, this represents x*y*z, so naturally to handle this SymPy should simply split the name into three. But we can’t split every token; consider these:
x_2
alpha
Splitting these names would not match user expectations at all.
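SymPy’s behavior matches these expectations; split_symbols splits single-letter runs but leaves names like alpha or x_2 alone:

```python
from sympy.parsing.sympy_parser import (
    parse_expr,
    standard_transformations,
    split_symbols,
    implicit_multiplication,
)

transformations = standard_transformations + (split_symbols, implicit_multiplication)
print(parse_expr("xyz", transformations=transformations))    # x*y*z
print(parse_expr("alpha", transformations=transformations))  # alpha, not split
```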
So taking these considerations into account, how should we implement implicit parsing? We have a few options; SymPy’s answer, as we’ll see, is to transform the token stream. (AST transforms aren’t an option because this syntax isn’t valid Python syntax.)
Let’s look at the SymPy parser. If you just want to read about implicit parsing, jump to the section about token transformations.
Note: all data here is based on SymPy 0.7.3
Overall, SymPy follows these steps (you can follow along in sympy_parser.py): tokenize the input string, run transformations over the resulting token stream, untokenize back into a string of valid Python, and finally eval() it.
Yes, we use eval(). It’s not safe. (SymPy Live and SymPy Gamma deal with this by ignoring the problem: they run on Google App Engine, which is sandboxed.)
The tokenizer handles some syntax not in vanilla Python. In particular, we’d like to be able to parse expressions like these:
x! (\(x\) factorial)
x!! (\(x\) double factorial)
0.[123] (\(0.\overline{123}\))

Note that the last example is valid Python—it’s equivalent to (0.)[123]. But to make it easier to recognize and transform, we modified the tokenizer to handle it as a special case:
# Standard library
>>> tokenize.tokenize(StringIO('0.[123]').readline)
1,0-1,2: NUMBER '0.'
1,2-1,3: OP '['
1,3-1,6: NUMBER '123'
1,6-1,7: OP ']'
2,0-2,0: ENDMARKER ''
# SymPy
>>> sympy.parsing.sympy_tokenize.tokenize(StringIO('0.[123]').readline)
1,0-1,7: NUMBER '0.[123]'
2,0-2,0: ENDMARKER ''
! and !! are simply operators now:
>>> sympy.parsing.sympy_tokenize.tokenize(StringIO('x!!').readline)
1,0-1,1: NAME 'x'
1,1-1,3: OP '!!'
2,0-2,0: ENDMARKER ''
>>> sympy.parsing.sympy_tokenize.tokenize(StringIO('x!').readline)
1,0-1,1: NAME 'x'
1,1-1,2: OP '!'
2,0-2,0: ENDMARKER ''
Unfortunately, since this is based on an older tokenizer, it doesn’t support Python 3 features, such as bytestrings, and it’ll incorrectly parse longs in Python 3, not to mention it doesn’t support Unicode.
Say you’re making a SymPy calculator/grapher/what-have-you. The user inputs this expression:
sin(x) + 3
and your app spits out this error message:
NameError: name 'x' is not defined
Not a very good calculator. SymPy handles this by transforming the token stream; in this case, undefined variables are wrapped in Symbol() calls to turn them into SymPy symbols. By default, parsing uses these transformations:

Undefined names become Symbols.
Complex literals are converted (3 + 4j becomes 3 + 4*I).
Repeating decimals become Rationals (0.[123] becomes Rational(123, 999)).
Floating-point literals become Floats.
Integer literals become Integers.
! and !! become factorial or factorial2 as appropriate.

SymPy also provides other useful transformations:
For instance, with convert_xor, the XOR operator ^ becomes exponentiation.

Note that SymPy’s transformations always produce valid Python code, since we’re not modifying the parser.
The API for this is a bit hidden. Here’s an example:
>>> from sympy.parsing.sympy_parser import parse_expr
>>> parse_expr("1/2")
1/2
>>> type(_)
<class 'sympy.core.numbers.Half'>
>>> from sympy.parsing.sympy_parser import standard_transformations,\
... implicit_multiplication_application, convert_xor
>>> transformations = (standard_transformations +
... (implicit_multiplication_application,))
>>> parse_expr("2x", transformations=transformations)
2*x
>>> transformations = (standard_transformations + (convert_xor,))
>>> parse_expr("x^3", transformations=transformations)
x**3
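The standard transformations alone already cover the factorial and repeating-decimal notation described above:

```python
from sympy import Rational, Symbol, factorial
from sympy.parsing.sympy_parser import parse_expr

# Repeating decimals become exact Rationals (123/999 reduces to 41/333)...
assert parse_expr("0.[123]") == Rational(123, 999)

# ...and ! is rewritten into a factorial() call.
assert parse_expr("x!") == factorial(Symbol("x"))
print("both notations parse")
```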
How do the implicit parsing transformations work? They’re split into four transformations: symbol splitting, multiplication, application, and exponentiation, which must be run in that order, though not all need to be run.
Symbol splitting is quite simple: for each name in the token stream, the transformation checks that it 1) is part of a Symbol, so that it doesn’t split function names, 2) is not a Greek letter, and 3) does not contain an underscore. If so, it makes one Symbol for each letter in the original name.
Implicit multiplication scans the tokens two at a time, looking for one of these conditions:

Two adjacent parenthesized expressions ((x + 2)(x + 3))
A parenthesized expression next to a function application ((x + 2) sin(x))
A constant or symbol next to a parenthesized expression (pi (x + 3))
Two adjacent constants or symbols (pi E EulerGamma)

Application (and multiplication, though it’s not necessary) depends on an intermediate step that groups function calls into one token—sin(x), while represented with seven tokens normally (['sin', '(', 'Symbol', '(', "'x'", ')', ')']), is first grouped into one “token” (called AppliedFunction in the source). The transformation then applies each bare function name to the expression that follows it.
This way, sin x^2 gets parsed as sin(x^2), not sin(x)^2. There’s some more logic to make life easier for the function exponentiation transformation as well.
To implement function exponentiation, think about how the token stream would look at this point. All functions are applied, so for sin**2 x, we would have something like this:
['sin', '**', 'Integer', '(', '2', ')', '(', 'Symbol', '(', "'x'", ')', ')']
If you have implicit multiplication enabled, it’ll actually look like this (note the extraneous multiplication):
['sin', '**', 'Integer', '(', '2', ')', '*', '(', 'Symbol', '(', "'x'", ')', ')']
The transformation has to figure out what constitutes the exponent and what constitutes the function call. The rule SymPy uses is, essentially: the exponent is everything from and including the exponentiation operator to the first closing parenthesis it sees (this rules out anything but simple integer and symbolic exponents; sin**(x+3)(x+5) won’t be parsed correctly). It then parses the following tokens, discarding the extraneous multiplication if it exists, to find the closing parenthesis of the function call (this correctly handles nested parentheses), and moves the tokens for the exponent to the end. So what the parser ends up seeing is
['sin', '(', 'Symbol', '(', "'x'", ')', ')', '**', 'Integer', '(', '2', ')']
which is equivalent to sin(Symbol('x'))**2.
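This behavior is exposed through the function_exponentiation transformation:

```python
from sympy import Symbol, sin
from sympy.parsing.sympy_parser import (
    parse_expr,
    standard_transformations,
    function_exponentiation,
)

transformations = standard_transformations + (function_exponentiation,)
expr = parse_expr("sin**2(x)", transformations=transformations)
# The exponent tokens have been moved past the function call:
assert expr == sin(Symbol("x")) ** 2
```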
One final note: SymPy uses an evaluation trick for the final result. SymPy defines Symbol.__call__ so that this works:

>>> spam = sympify('f(x)')
>>> x = Symbol('x')
>>> eggs = Function('f')(x)
>>> spam == eggs
True
However, it’s a point of contention whether this should be done at all.
So there you have it. SymPy’s parser.