Wednesday, March 20, 2013

Computer Science Education

SIGCSE, aka ACM's Special Interest Group for Computer Science Education, was in Denver, Colorado a few weeks ago, and it was kind of fitting that even though I wasn't there, I heard about one of the best ideas in computer science education I've ever come across.

Jim Baker, whom I know through +Nicholas Skaggs and Alex Viggio, is an Adjunct Professor at the University of Colorado Boulder teaching a class in Programming Languages.  The class seems pretty standard: reading assignments, homework, programming projects, papers, etc.  However, the innovative thing is how all of that is structured around GitHub.

I've mentioned this over several posts: version control is a missing element in most computer science education.  It's necessary to understand its benefits and to be used to the way it affects your coding style.  It's dramatically shaped mine: I now try to code in segments so I can commit complete thoughts, and, after breaking things for other users, I check my code before pushing to the remote server.  Bad habits start early, and the earlier developers learn to use a version control system, the fewer bad habits they'll acquire.

Students in Jim's class are required to acquire a GitHub account and register their username with the class. The class uses the students' GitHub accounts to create repositories for them and push homework assignments into the students' repositories.  It's an innovative way to release homework assignments, as it allows the instructor to modify them as questions come in or issues are discovered.  It was always annoying when a class website would get updated with a change to an assignment, but you'd only learn about it 4 days later during the next class session.  This style changes the standard class communication structure from the students pinging the course (in this case, the website) to the instructor pinging his or her students: "pull down the changes to the assignment".  It's a valuable lesson to learn: pull from your remote repository often.  It's so valuable that +Philip Chase once created a cheat sheet of daily tasks for his developers that had "git pull" as item number 1.  It seemed they were incapable of pulling or pushing to the remote repository, and eventually their lack of proper process caused some major malfunctions.

Projects are submitted to a standard "import, test, and evaluate" CSE system.  We had something very similar at UF to check for cheating.  However, in addition to pushing to this system, the student must also submit a pull request to the class.  Pull requests allow for peer review: you can see a diff of your code repository vs the source repository and receive feedback from the upstream developer (in this case the instructor).  This process is used in most open source projects and is beginning to gain traction in the corporate world with code review systems like Gerrit.  Learning this process adds practical training to their otherwise theoretical class assignments.  It also provides an extra incentive to code well in the form of extra credit.  Students whose code does the tasks better than the instructor's, or in a more innovative way, have their pull requests accepted and receive extra credit.  If a student has tests in their code they can also receive extra credit.

The class has a paper and presentation requirement.  The paper has to be written in Markdown, a lightweight markup language much like wiki syntax.  Companies are increasing their use of dynamic documentation with corporate wikis.  It allows the documentation to change on the fly and maintain a revision history without attempting to share a document and entangle yourself with locking and unlocking.  Markdown is also just a step away from LaTeX, which academics use in creating articles.  The papers, after being submitted, are required to receive reviews from two classmates.  Over the students' professional careers they will be required to review and be reviewed when writing documentation.  This unique approach to paper submission readies the students for either the academic world or the corporate world.

I hope that this approach grows.  I know GitHub attended SIGCSE and attempted to expand git as an educational tool.  I'm hoping someone convinces Jim to write a paper about it and document the successes and the failures.  Until then ...

Thursday, March 7, 2013

Software - Tradecraft or Science

Talking with one of my good friends in software development the other day, I stumbled upon a question:
Developing software, is it a tradecraft or is it a science?
The question came up while talking about the skills of various people where we used to work.  We had all kinds: people with master's degrees in software engineering, others with a bachelor's degree in computer science.  Some didn't have a degree in software at all, such as my friend whose degree was in music.

What was interesting to note here is that degree should have indicated skill, if we were a pure science.  "I have studied more complicated theories than you and thus I am superior."  However, some of our most accomplished employees were some of our worst.  People with master's degrees and multiple classes in databases came up with some of our worst database designs.  Duplicated information in tables that should have been in a lookup table, using MyISAM on a table that should have had foreign keys; the list of tragedies (in my opinion) goes on.  Now I'm not an advocate for one normal form over another.  Going all the way out to fourth normal form can sometimes be extreme, and a good software engineer should balance all aspects when building a program.  It's this thinking that started the wheels turning; software development or engineering has elements of a tradecraft.
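To make the lookup table complaint concrete, here is a minimal sketch of what I mean.  It uses Python's built-in sqlite3 module rather than MySQL, purely so it runs anywhere, and the department and employee tables are hypothetical stand-ins, not the schema we actually had.

```python
import sqlite3

# In-memory database; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE          -- each name is stored exactly once
);
CREATE TABLE employee (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(id)
);
""")

conn.execute("INSERT INTO department (id, name) VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee (name, department_id) VALUES ('Ada', 1)")

# A row pointing at a department that does not exist is refused.
try:
    conn.execute("INSERT INTO employee (name, department_id) VALUES ('Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON d.id = e.department_id"
).fetchall()
```

The point is only that the department name lives in one place and the database itself refuses an orphaned reference, which is the kind of protection a storage engine without foreign keys cannot give you.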

In the sense that it's a trade, time tells you when by the book isn't necessary.  As you work at your trade you develop things like toolkits and acceptable shortcuts.  In software we come up with tools for bug analysis, diffing files, profiling performance, writing code.  There are at least a few hundred articles on the web devoted to a developer's "toolkit".  Some talk about it like finding that perfect set of woodworking tools, where your hand has practically reshaped the handle from use.

In a purely technical sense it's also an art form.  Good code is not just functionally correct, but it's elegant.  Sometimes it's the way the code touches the processor, RAM, and disk in the most minimal and efficient way possible.  Other times it's the way it does a large amount of work with a minimal amount of logic.  I'm not talking about a Perl one-liner but simple, easy to read, powerful to use code.

I think, like civil engineering and architecture, software engineering hints at the blending of these worlds: tradecraft and science.  The purely analytical with the artistic.  A building built on math alone would be a pretty boring building.  Architects learn to make a building soar.  In the same vein a great software engineer can make a program sing.  Don't get me wrong, we need the science too.  Without it our programs would be like the failed library building where the architect forgot to account for the weight of the books.  Just don't forget the craftsmanship.

Wednesday, March 6, 2013


When you've played every iteration of a game, like I have with Zelda, Assassin's Creed, or Bioshock, you see features come and go as the developers try to make the perfect game.  I've played SimCity since SimCity 2000 on Chris Pearson's old 450 MHz Pentium II in high school.  I've seen the complexity get ramped up, and the size of the cities explode.  When I played the demo on Des' laptop for the latest SimCity, I was floored at how fun this game looked.  Sure the city was small, but services were simpler and the notification system was great.  We decided to buy two copies so we could play together and eagerly awaited its release.  Perhaps it's just release day bugs, but I want my money back.

Stop making the web required

On the Myers-Briggs test I sit just barely over the line towards extrovert.  It's because in certain situations I'm a social butterfly: walking, talking, eating, etc.  However, in some situations I'm not, i.e., gaming. When I buy a game, I don't rush out to the multiplayer system.  I play single player all the way through and then if some of my friends are playing I'll try out the multiplayer.  I don't want to link to everyone all the time.  I shouldn't have to wait on a server to play my game.

Many games are requiring online connections as a method of DRM.  If that's the way you want to play it, fine: require my connection at the first log in, verify my information, store my log in locally, and link me when I'm online.  If I have a laptop and want to play a game on the road or in a coffee shop, why am I getting denied because I don't have an active connection?  I can play the new StarCraft without a connection, as long as I had a connection when I installed it.  I won't earn rewards, but that's OK if all I'm looking for is a little fun.

AutoSave does not replace Save

I don't know how many times I heard at the HelpDesk, "But I had autosave on!"  "Yes", I'd state, "let's check your settings, oh you had it saving every 8 hours, sorry you've lost your paper."  Now, nothing was more annoying in Final Fantasy than getting too into the game and forgetting to save.  Inevitably it was always when you had forgotten for an hour or two that suddenly the game froze or you were horribly killed.  However, you learned an important lesson every time: "save early and often".

Desiree has exclaimed several times since SimCity has come out, "Crap it crashed" only to return and find her city 10 years younger than when she last saw it.  Just this past evening she was telling me about an arson problem in her SimCity and that she had finally gotten a police station to deal with it just before the crash.  Sure enough when the system let her sign in again, no police station.  

Let people save for themselves.  Let us manage our saves locally.  It's great that we can pick up wherever we left off on another computer, but don't take away the basics to give us this feature.  Simply warn us, "System saves do not transfer from computer to computer."

It's called a cluster

If the servers named North America East 1-3, West 1-2, Eastern Europe 1-2, Western Europe 1-2, etc. are clustered servers, then I feel you need to send your engineers back to school.  Why in 2013 am I picking a server node?  I haven't picked a game server in years. Perhaps that's because I play on Xbox more often than anything else.  However, my last PC gaming experiences were League of Legends and StarCraft 2, both online games, and I didn't pick the servers with either of them.

Give me one huge node to hit and spread my friends' and my traffic across the nodes with load balancing and clustering.  If you do it right you can simply add more nodes as traffic picks up, à la Amazon. I guess that's the biggest problem.  OpenStack, Amazon EC2, and tools like Juju have shown us that clustering servers and spinning up new nodes can be trivial, if implemented correctly.
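For anyone wondering what "one huge node" could look like from the player's side, here is a toy sketch of round-robin load balancing in Python.  The class, the node names, and the player names are all hypothetical, and a real service would also need health checks and session affinity; this only shows the idea of one entry point spreading players across a pool that can grow.

```python
class RoundRobinBalancer:
    """Single entry point that hands each request to the next node in turn."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._count = 0  # number of requests routed so far

    def add_node(self, node):
        # New capacity joins the pool without the player picking anything.
        self.nodes.append(node)

    def route(self, player):
        # The player argument is unused here; a real balancer might hash on
        # it to keep friends on the same node.
        node = self.nodes[self._count % len(self.nodes)]
        self._count += 1
        return node


balancer = RoundRobinBalancer(["node-1", "node-2"])
first_three = [balancer.route(p) for p in ("me", "des", "chris")]

balancer.add_node("node-3")  # traffic picked up, so add a node
next_three = [balancer.route(p) for p in ("alex", "jim", "phil")]
```

The player only ever talks to the balancer, so adding a node is an operations decision rather than a menu choice on the login screen.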

Tutorial, do I look like I ask for directions?

It's great that games come with tutorials.  If I ever have a problem, I often play through them.  However, if I've played the demo, or previous versions of the game I like to dive right in.  I'm currently stuck (one of the many reasons I'm writing and not playing) because the game says I need to play the tutorial, again (already played it once since it was released) and the tutorial keeps crashing.  

My favorite tutorials are in Call of Duty: go through door A for the tutorial or B to get into the action.  "Do you want the red pill or the blue pill?" Super easy to just jump right in or get a little help first.

Tutorials are great.  
Good for you making one.  
Can I just play my game now?


It still looks like a fun game, and despite Desiree's frustrations she appears to be having fun.  However, for me the game has been out for roughly 48 hours and I've played exactly 1 hour and attempted to play 2 additional hours.  That's the biggest flaw in this game.  If this were my first experience with SimCity, I'd never pick it up again.

Friday, March 1, 2013

Code Academy - Interesting Concept

Alex Viggio, my development lead at CU, sent me a link to the 2012 Crunchies Awards on TechCrunch.  The winners were some of the usual suspects, but I saw a site that piqued my interest: CodeAcademy.

Their slogan is "Teaching the world to code." Now unlike a certain mayor, I don't believe everyone in the world can code.  It takes a 'different' mind than what everyone has.  I also don't believe everyone can paint, do carpentry, manage a team, fly a plane.  We're all different and we will find things beyond our abilities.  However, just because everyone won't be successful doesn't mean everyone shouldn't try.

I decided to take CodeAcademy for a spin.  It's important to me that people be taught to code in a responsible manner.  Bad habits develop early and lord knows I have my fair share.

The Profile

This is a requirement of almost any site, especially on the 2.0 web.  Unlike a lot of sites that attempt to integrate themselves with every known social site and application, CodeAcademy does a good job of selecting sites that represent developers:
  • GitHub - as a developer if you haven't at least tried git and GitHub then I believe you either have A) been living under a rock for the last few years or B) are my father and work on mainframes.  This is a great place to send new developers.  The first question I usually ask during an interview is "How do you share code?"  I then usually face-palm as they say, "thumb drive" or "email".
  • LinkedIn - let's face it, Facebook is for family and friends, LinkedIn is for co-workers and employers.  If you want to develop code professionally, you should have a LinkedIn account.
  • Twitter - I know several developers use it to point out interesting coding articles they find.  It's not as important as the first two, but it has been useful in the past.
They have taken a page out of gaming that I love in web 2.0 sites: badges.  You see them all over StackExchange, Ubuntu, Fitocracy, and others.  It's like merit badges for adults, giving a little 'props' for completing something.  It's also a nice way to measure yourself against your peers and see areas where you can explore.  Perhaps this is just turning us into needy, praise-driven zombies, but that's a more cynical observation of the reward system.  I prefer to look at anything that motivates us as a positive.

The rest of your profile reflects your current progress.  It shows the tracks you are currently working on and the courses you've completed.  It's a nice portal into your history, allowing you to return to where you last left off.


When you first enter the "Learn" side of the CodeAcademy interface you'll see a variety of tracks.  Some are coding languages: Python, JavaScript, Ruby.  Others are more applied: APIs, Web Fundamentals, Projects.  I think this is a great start; I just wish that the system had more formal languages (C, C++, Java) and even some data languages and syntax (SQL, XPath, SPARQL).  All of the languages they presented are dynamically typed languages, and I assume they have a lighter interpreter since they are evaluated at run time instead of compile time.

Python Course

The first track I started was the Python course.  I already know a bit of Python from some Google courses and my own messing around, but it's not a language I use every day.  I figured a bit of a refresher wouldn't be bad and would help me gauge the course since it's the closest to a newbie that I could get.

The course starts off slow, going over the basics pretty much in the order of any standard language book or first year programming course.  Syntax, basic data structures, conditional statements, looping, advanced data structures; it's all laid out.  There are a few courses in the middle which reiterate things like looping and feel a bit redundant. However, it's a good overview of the language, with a bit of programming basics thrown in for those without computer science degrees.

An interesting element of the track is that each course in the Python track is paired with a project.  The projects attempt to apply a scenario to the information you learned in the course.  Some of them are quite well done, and allow you to define everything.  Others are quite rigid and basically offer a paint by numbers approach.

The only issue I found with this track is that several sections are quite buggy and do not offer assistance about why you don't pass.  For example, the challenge "Exam Statistics" requires you to compute the variance of a set of grades.  In Python it looks something like:

grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]

def print_grades(grades):
    # Print each grade on its own line
    for grade in grades:
        print(grade)

def grades_sum(grades):
    # Add the grades up by hand, as the exercise asks
    total = 0
    for grade in grades:
        total += grade
    return total

def grades_average(grades):
    sum_of_grades = grades_sum(grades)
    average = sum_of_grades / len(grades)
    return average

def grades_variance(grades, average):
    # Population variance: the mean of the squared deviations from the average
    variance = 0
    for grade in grades:
        variance += (grade-average)**2
    variance = variance/len(grades)
    return variance

Which computes the answer 334.071005917. This is correct when used in the next section.  However, the section indicates something is wrong: "Oops try again."  That's the frustrating thing about some courses in CodeAcademy.  The feedback and review on the course is determined by the administrator.  Where one course will say "Oh you didn't print the string but the array. Print the string instead", another will just say "Oops" and hope you understand enough about why you failed.  The "Oops" courses seem to have the most restrictive requirements for passing each section, and so it only increases the frustration.
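If you want to convince yourself the number is right without trusting the course checker, one option is to compare it against the standard library.  This is only a sketch: it assumes Python 3.4 or later, where the statistics module exists (the course itself ran an older Python), and the grades_variance below is my own condensed version of the function above, computing the average internally.

```python
from statistics import pvariance

grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]

def grades_variance(grades):
    # Population variance: mean of squared deviations from the average
    average = sum(grades) / len(grades)
    return sum((grade - average) ** 2 for grade in grades) / len(grades)

manual = grades_variance(grades)
library = pvariance(grades)  # standard library population variance

print(round(manual, 9), round(library, 9))
```

Both values agree with the 334.071005917 above, so whatever the section was unhappy about, it wasn't the arithmetic.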

HTML Course

I started this course because it was something I knew very well.  The speed of the course was very slow for me, but probably the right pace for my mother.  It's slow, reiterates itself, and goes through every major structure of a webpage.  Overall if you know the basics of HTML, skip this course, unless you are curious like I was.

Despite the slow and boring nature of the course, I did appreciate the design that went into building its interface.  Unlike the Python course, which had a run environment at the bottom and code at the top, the HTML courses had tabs for various files (such as the index and CSS files here) and a panel to the right which shows the result of the code on the page.  It has a fairly active autosave, but includes a submit button to force display and call the course evaluation procedures.  It was clean and easy to follow along and work in the panels.


While coding isn't for everyone, CodeAcademy does make it accessible to anyone willing to try.  Its courses are well thought out, and it appears to be actively expanding.  You'll find a few bumps in the road with courses that are too restrictive in their completion criteria or not descriptive enough in the requirements to pass.  However, the forums and bug tracker are active and responsive, so hopefully time will fix those courses.

It's not a bad resource for active developers that haven't touched a language or area of development before.  It also has a "Teach" section that allows developers to give back to CodeAcademy and share their knowledge with the community.  

For my part, I may try my hand at the teach section. I'll also keep an eye out for any new courses or tracks.

I leave you with a little poem from the first section in Python (which is from the Python library)

Zen of Python
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
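If you'd like to see where the poem lives, it ships with Python as the `this` module.  A small sketch: importing the module prints the text, and the module keeps the poem ROT13-encoded in its `s` attribute, which the standard codecs module can decode.  The variable names here are my own.

```python
import codecs
import this  # importing the module prints the Zen of Python

# The module stores the poem ROT13-encoded in this.s
zen = codecs.decode(this.s, "rot13")
title = zen.splitlines()[0]
```

In an interactive session, simply typing `import this` is enough to get the poem printed.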

Appendix: Some sites about coding