A Final Grade Notifier in Two Nights
A couple of nights ago, my friend Stephen Poletto and I undertook a project to build a system for Brown University students that automatically notifies you when new final grades are posted to your account. As any college student knows, this is the time of year when most free moments are consumed by frantic refreshing of the final grades website. It’s obnoxious, and we wanted a way to break away from that system.
The final product is simple: you register for the application, and whenever new grades are posted to your account, you get a text message and an email. You can even reply to the text message, and the app will call you and read your grades to you (for privacy reasons, we don’t put your grades in the text message). In spite of (or perhaps, because of) its simplicity, the architecture behind the system is — in my oh so humble opinion — a work of genius.
We had two major goals when implementing the system that we knew couldn’t be sacrificed: Fort Knox-level security, and a distributed, scalable workflow. For those interested in learning more about the internals, follow along with the code on GitHub.
The Workflow
Brown uses Banner for its grades, and as anyone who’s ever used Banner can readily tell you, it’s… slow. And very broken, in many ways. For that reason, among others, we decided early on that having a single server querying Banner for grade information wouldn’t suffice, especially since we wanted to give Banner a little breathing room between requests from the same IP address so as not to frighten it. We decided, therefore, to use a distributed architecture that looks a little bit like this:
How it works in a nutshell: a user from the Interwebz visits our website to sign up. When they do, the web server makes a short request to Banner to verify that they entered a valid username and password, then stores their credentials and contact information in the database. It then kicks our internal manager application to tell it that a new user has joined and should be inserted at the front of the worker queue. Finally, it displays a “Congratulations” message to the user, and registration is complete.
The manager application, meanwhile, is where the action really happens. At server start-up, it creates a queue of all of the users registered for our application, sorted from least- to most-recently-updated, which is automatically restacked every hour (users will also be inserted on an ad-hoc basis immediately after sign-up). Worker nodes, running on separate machines (we’re currently using a separate web server, a Mac Pro, and a laptop on the side), then ask the manager for a new user every 10 seconds or so. If there are still any users pending in the queue, the manager peels the next one off and hands that user’s username and password to the worker.
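To make the queue behavior concrete, here is a minimal in-memory sketch of the idea described above. It is not the code from our repository: the names `User`, `ManagerQueue`, `restack`, `add_new_user`, and `next_user` are all hypothetical, and a real manager would load users from the database rather than from a list.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class User:
    username: str
    last_updated: float  # Unix timestamp of the last grade check (assumed field)


class ManagerQueue:
    """Hypothetical in-memory queue of users waiting for a grade check."""

    def __init__(self, users):
        self._users = list(users)
        self._queue = deque()
        self.restack()

    def restack(self):
        # Rebuild the queue with the least recently updated user first.
        # The post describes doing this at start-up and again every hour.
        self._queue = deque(sorted(self._users, key=lambda u: u.last_updated))

    def add_new_user(self, user):
        # New sign-ups go to the front so their first check happens quickly.
        self._users.append(user)
        self._queue.appendleft(user)

    def next_user(self):
        # Called when a worker asks for work; None means the queue is empty.
        return self._queue.popleft() if self._queue else None


queue = ManagerQueue([User("alice", 200.0), User("bob", 100.0)])
queue.add_new_user(User("carol", 0.0))
order = [queue.next_user().username for _ in range(3)]
print(order)
```

The new sign-up is served first, followed by the existing users from stalest to freshest. Once the queue is drained, `next_user()` returns `None` until the next restack.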
It’s then the job of that individual worker to talk to Banner, check if there are any new grades published, and alert the manager of the results. The manager then updates the database, and if there are new grades, emails the user with a grade report, and makes a call to the fantastic Twilio API to send the user a text message. If at any point a registered user texts something beginning with “g” to the assigned phone number, Twilio will inform the manager, and the manager will then instruct Twilio to call the user and read them their grades.
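The "are there any new grades?" step boils down to comparing what Banner reports now against what was stored last time. A rough sketch, assuming grades are kept as a course-to-grade mapping (the function name `find_new_grades` and the course labels are made up for illustration):

```python
def find_new_grades(previous, current):
    """Return the courses whose grade is new or changed since the last check."""
    return {
        course: grade
        for course, grade in current.items()
        if previous.get(course) != grade
    }


stored = {"COURSE-A": "B+"}
fetched = {"COURSE-A": "B+", "COURSE-B": "A"}
fresh = find_new_grades(stored, fetched)
print(fresh)
```

An empty result means there is nothing to report, so no email or text goes out. Anything else would be handed to the notification step.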
The beauty of this architecture is that any number of worker nodes can be running on any number of machines in unison, all pulling users from the queue. With a single worker, we can handle a load of about 120 users and still provide hourly updates to all of them without ruffling Banner’s feathers. With just 5 workers, some of which can be running on the same machines, we can handle up to 600 users, and it scales easily from there. Five lightweight virtual machines each running 10 worker nodes could theoretically gather grades for the entire Brown undergraduate population once each hour. With VMware Fusion or similar, one physical machine could run the entire operation.
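The numbers above are simple multiplication, sketched here for anyone who wants to plug in their own values. The only input taken from the post is the figure of about 120 users per worker per hour; the function names are hypothetical.

```python
import math

USERS_PER_WORKER_PER_HOUR = 120  # figure quoted in the post


def seconds_per_user(users_per_hour=USERS_PER_WORKER_PER_HOUR):
    # Time budget each worker spends per user, including any polite pause.
    return 3600 / users_per_hour


def capacity(workers, users_per_worker=USERS_PER_WORKER_PER_HOUR):
    # Users that can be checked once per hour with this many workers.
    return workers * users_per_worker


def workers_needed(users, users_per_worker=USERS_PER_WORKER_PER_HOUR):
    # Smallest number of workers that still gives everyone an hourly check.
    return math.ceil(users / users_per_worker)


print(seconds_per_user())    # 30.0
print(capacity(5))           # 600
print(capacity(5 * 10))      # 6000
print(workers_needed(6000))  # 50
```

So a single worker budgets roughly 30 seconds per user, and five virtual machines running ten workers each cover 6,000 users per hour.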
Making it Secure
We knew that if we wanted to legitimately ask our users to give us their Banner login information, we needed to make the system secure. Like, really secure. And really secure we made it. Let’s walk through the workflow again, but this time with security details.
When someone visits our website to sign up, they’re switched over to SSL right away. On the advice of our friendly security guru and cohort Neal Poole, we sucked it up and spent the money for a real, honest-to-goodness SSL certificate from GeoTrust, not just a cheap-o self-signed one. The benefits: a little green lock icon in Chrome (and a blue badge in Firefox, and a grey lock in the top-right corner of Safari). Win.
Beyond the added user comfort, this ensures that all traffic between the user and our front-facing web server is encrypted end to end, which means it’s very, very difficult for anyone to eavesdrop on the connection. AES-256-CBC for the traffic, with a 2048-bit key on the certificate. Ye-ah. We use the same SSL encryption scheme when the worker nodes phone in to the manager application, and also use SSL when communicating with Banner and Twilio.
Once a user’s username and password have been validated during the registration process, the password is encrypted using the server’s private key before being stored in the database. In order for a worker node to be able to use the credentials it receives from the server, it must be manually bestowed with the server’s public key. This also means that we (as administrators) can look at the database, and never actually see anyone’s password in plain text.
For added security, in order for a worker node to even be allowed to request credentials from the manager, it must identify itself using a mechanism very similar to SSH public key authentication. Whenever we set up a new worker node, we generate a public/private key pair specifically for that worker. We then store a copy of the worker’s public key on the manager server, and associate it with the worker’s unique ID number. Each request from the worker to the manager must be signed using the worker’s private key, and is verified by the manager before being processed. The requests and signatures are also timestamped, which helps to prevent replay attacks (which would gain an attacker little anyway, because the credentials are encrypted, and therefore useless to anyone but a valid worker).
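Here is a sketch of the timestamp-plus-signature check. One loud caveat: to keep it runnable with only the Python standard library, it swaps in an HMAC with a per-worker shared secret in place of the public/private key signatures described above. The shape of the check (look up the worker by ID, reject stale timestamps, verify the signature) is the same, but the cryptography is not what our system uses. The 60-second window, the message format, and all names here are assumptions.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 60  # assumed freshness window; the post does not give one

# Hypothetical registry mapping worker ID to key material. In the system
# described above this would hold each worker's public key, not a secret.
WORKER_KEYS = {"worker-1": b"example-secret"}


def sign_request(worker_id, key, timestamp):
    # Stand-in for signing with the worker's private key.
    message = f"{worker_id}:{timestamp}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_request(worker_id, timestamp, signature, now=None):
    now = time.time() if now is None else now
    key = WORKER_KEYS.get(worker_id)
    if key is None:
        return False  # unknown worker ID
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old (or too far in the future): possible replay
    expected = sign_request(worker_id, key, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


sig = sign_request("worker-1", b"example-secret", 1000)
print(verify_request("worker-1", 1000, sig, now=1010))  # fresh request
print(verify_request("worker-1", 1000, sig, now=2000))  # replayed later
```

The first call passes and the second fails, because the same signed request shows up well outside the freshness window.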
So let’s sum it up: public/private key encryption on the passwords, public/private key signing and authentication with each worker, replay attack resistance, and it’s all done over an encrypted SSL pipeline. Not to mention a randomized password on our database, which itself is behind a firewall, and not listening on any Internet-facing interfaces.
Needless to say, we put a lot of work into the security.
Sidebar: Caller ID Spoofing
Since the application has the capability of communicating via text message and phone call, we also had to take a number of other security issues into account. For one, the “phone on the desk” scenario. We decided not to put your grades in the notification text message itself, because we didn’t want a situation where your phone goes off across the room, your friend leans over to look, and just says “man, sorry about chemistry dude.”
Thus, in order to get your grades by phone, you can have the system read them to you, privately. That does raise the issue of caller ID spoofing, however.
One of the original models we explored was one in which you could simply call the phone number dedicated to the application, and it would fetch and read your grades to you. However, this would be vulnerable to caller ID spoofing: if I knew you had an account on our app, I could spoof your caller ID, and have your grades read to me.
We solved the problem by only providing meaningful information in outgoing communications. When you text “g” to the application, it will call you, and read you your grades. This is immune to caller ID spoofing: let’s say I’m being nefarious, and I spoof Josiah Carberry’s caller ID on a text to the application. The application will call that number back (if it belongs to a registered user) and read the grades over the phone, as expected. The catch for the attacker is that the call, regardless of the origin of the text, will go to Josiah Carberry, because it’s his phone number that made the request! Foiled again, caller ID spoofers.
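The rule is easy to express in code: an incoming text never gets grades back directly, it can only trigger an outbound call to the number on file. A minimal sketch, with no real Twilio calls in it; the function name, the registry, and the phone number are all placeholders.

```python
# Hypothetical registry mapping registered phone numbers to usernames.
REGISTERED_NUMBERS = {"+15555550100": "jcarberry"}


def handle_incoming_text(from_number, body):
    """Return the number to call back, or None if the text should be ignored."""
    if not body.strip().lower().startswith("g"):
        return None  # not a grade request
    if from_number not in REGISTERED_NUMBERS:
        return None  # sender is not a registered user
    # The call always goes to the claimed sender, so a spoofed caller ID
    # only causes the real owner's phone to ring.
    return from_number


print(handle_incoming_text("+15555550100", "Grades please"))
print(handle_incoming_text("+15555550199", "g"))
```

Whoever actually sent the text, the only phone that can ring is the one registered to the account.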
What We Won’t Do
I feel like this is an appropriate time to make a statement about the future direction of this project, since people have made a number of suggestions already: we will not, under any circumstances, at any time adapt the tool to allow the pushing of information to Banner. That means that we will not turn this into an automated course registration system. This tool is a convenience, and nothing more; giving it more capabilities could give its users a competitive advantage over other students, which is not only against Brown’s Acceptable Use Policy, but very much unethical.
We also do not, at this time, have any intention of deploying this tool for other schools that use Banner. Our code is, however, open source, so if someone at another school were interested, they could use it for inspiration. It will certainly need some modification, though: our source is not published for the purpose of use by others, but rather as an act of transparency to make our users even more comfortable.
Conclusion
As side projects go, this was a fun one. And as Xzibit would say:




Comments
Grant Butler said…
Just realized that my school runs Banner, too. I’ll totally be adapting this! Thanks for open sourcing it!
Liam DeBeasi said…
Wow, sounds like you put a lot of work into this over the course of a few days.
Quick question though…
Is Banner as bad as Edline? I mean, the Edline login page did get a facelift recently: http://edline.net
It also has some strange thing going on with the login button: Users can only get that blue-ish button once. They can get it by hovering over it. If they want it again, they need to reload the page. I guess they wanted more page views…