When to Pick Google Bigtable vs Other Cloud Platform Databases

Recently, Google Cloud Platform announced the availability of an additional database option for our customers: Google Cloud Bigtable. Now Cloud Platform developers have another Google-managed solution for storing their data. So, when do you use which tool?

The contenders:

Cloud SQL

Are you using some sort of relational database with tables, views, and indices? Does your application rely on stored procedures, custom table views, or joins? Do you need the certainty of transactions and ACID compliance? Is your workload a mix of reads and writes, not necessarily in equal amounts but not lopsided toward one or the other? If you answered “yes” to a lot of these, you probably want to investigate or stick with Cloud SQL, which, as its name suggests, is an implementation of MySQL hosted by Google.

Cloud SQL does have all of the limitations of MySQL. Certain types of applications don’t require the complexity of normalized data with no duplication. Other cases require scaling in a manner that SQL cannot handle without additional complexity, reduced performance, or higher cost. Going into the full pros and cons of SQL vs NoSQL is beyond the scope of this post, but rest assured there are reasons that both of them exist, and both are valid choices depending on circumstances.

NoSQL: Datastore or Bigtable

If you are leaning towards a NoSQL solution, you now have two Google-managed NoSQL choices on our platform. What is the differentiator between Cloud Datastore and Cloud Bigtable? They are both NoSQL solutions. Both are described as massively scalable. Both leave you with little to no management (or No Ops for those of you playing buzzword bingo). The answer lies in four areas:

  • Size
  • Structure
  • Analysis
  • Interface

Size

Bigtable is optimized for mind-bogglingly huge sets of data. Seriously, it is most cost effective when dealing with datasets that start at 1 terabyte. Datastore can handle large data sets too, but it is also performance and cost optimized for smaller ones. Have a few GBs of data? Datastore would be the better call. Have data that might start out small but grow to a terabyte in time? Still Datastore. Have data that starts at a terabyte and will keep expanding? Then you’ve started down a path that might make Bigtable interesting. But size isn’t the only factor.

Structure

Bigtable stores data in a big honkin’ table. Yeah, the name is a little on the nose, but it’s true. There are rows and columns somewhat like relational database systems but not exactly. But it has a schema, and predefined structure.

Cloud Datastore, on the other hand, is better suited to ad hoc storage of structured data representing objects. Basically you define an object and then push it into Datastore. You don’t define a schema, create tables, or set up any other sort of structure before storing a record.

Analysis

Do you need to analyze the data in massive aggregate scale while the database is still online and taking requests? Do you want to run MapReduce on your production data without copying it somewhere for study? Do you want to hook it up to various Big Data analysis toolkits?  If this sounds like what you want to do, Bigtable makes more sense.

Interface

If you are coming to Google Cloud Platform from other technologies, and are working with HBase, Bigtable is for you.  Bigtable is accessible through extensions to the HBase 1.0 API and is therefore compatible with a lot of the Hadoop ecosystem as well as other Big Data tools.

On the other hand, there are also a few limitations. You cannot join. There is no SQL interface. The API lets you Put, Get, and Delete individual records, or run Scan operations.
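To make that access pattern concrete, here is a minimal Go sketch of a table that only supports Put, Get, Delete, and Scan over row keys. It is a toy in-memory model, not the Bigtable or HBase client API; the `Table` type, its methods, and the row keys are all made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Table is a hypothetical in-memory stand-in for a keyed row store.
// It is NOT the Bigtable or HBase client; it only mimics the shape of
// the four operations: Put, Get, Delete, and Scan.
type Table struct {
	rows map[string]string
}

// NewTable returns an empty table.
func NewTable() *Table {
	return &Table{rows: make(map[string]string)}
}

// Put stores or overwrites the value for a row key.
func (t *Table) Put(key, value string) { t.rows[key] = value }

// Get returns the value for a row key and whether the row exists.
func (t *Table) Get(key string) (string, bool) {
	v, ok := t.rows[key]
	return v, ok
}

// Delete removes a row; deleting a missing row is a no-op.
func (t *Table) Delete(key string) { delete(t.rows, key) }

// Scan returns the row keys that start with prefix, in sorted order.
func (t *Table) Scan(prefix string) []string {
	var keys []string
	for k := range t.rows {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	tbl := NewTable()
	tbl.Put("user#1#login", "2015-05-01")
	tbl.Put("user#1#logout", "2015-05-02")
	tbl.Put("user#2#login", "2015-05-03")
	tbl.Delete("user#1#logout")

	v, ok := tbl.Get("user#1#login")
	fmt.Println(v, ok)
	fmt.Println(tbl.Scan("user#1#"))
}
```

Notice there is nowhere to express a join or a WHERE clause. Anything you want to look up together has to be reachable by key or by a key prefix, which is why row key design matters so much with this kind of store.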

Datastore does not have SQL either, but it has a query language called GQL that, while not exactly the same, abstracts querying objects in a way that most SQL developers should be able to understand quickly.

Conclusion

Finally the product page has a great explanation of Bigtable’s relation to other Google Cloud Platform offerings:

Cloud Bigtable and other storage options

Cloud Bigtable is not a relational database; it does not support SQL queries or joins, nor does it support multi-row transactions. Also, it is not a good solution for small amounts of data (< 1 TB).

  • If you need full SQL support for an online transaction processing (OLTP) system, consider Google Cloud SQL.

  • If you need interactive querying in an online analytical processing (OLAP) system, consider Google BigQuery.

  • If you need to store immutable blobs larger than 10 MB, such as large images or movies, consider Google Cloud Storage.

  • If you need to store highly structured objects, or if you require support for ACID transactions and SQL-like queries, consider Cloud Datastore.
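If it helps, that guidance can be read as a simple decision procedure. Here is a rough Go sketch of it. The `Needs` struct, the `Suggest` function, and the order in which the checks run are my own assumptions for illustration, not anything from the Bigtable documentation.

```go
package main

import "fmt"

// Needs is a hypothetical summary of what an application requires.
type Needs struct {
	FullSQL         bool    // joins, stored procedures, OLTP
	InteractiveOLAP bool    // interactive analytical queries
	LargeBlobs      bool    // immutable blobs larger than 10 MB
	ACIDOrSQLLike   bool    // ACID transactions or SQL-like queries on objects
	DataSizeTB      float64 // expected data size in terabytes
}

// Suggest maps needs to a storage product, following the bullets above.
// The check order is an assumption; real decisions weigh several factors.
func Suggest(n Needs) string {
	switch {
	case n.FullSQL:
		return "Cloud SQL"
	case n.InteractiveOLAP:
		return "BigQuery"
	case n.LargeBlobs:
		return "Cloud Storage"
	case n.ACIDOrSQLLike || n.DataSizeTB < 1:
		return "Cloud Datastore"
	default:
		return "Cloud Bigtable"
	}
}

func main() {
	fmt.Println(Suggest(Needs{DataSizeTB: 0.05}))
	fmt.Println(Suggest(Needs{DataSizeTB: 40}))
}
```

Treat it as a checklist, not an oracle. Plenty of real systems end up using two or three of these together.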

In short, there is a lot of awesome stuff about Cloud Bigtable, but it doesn’t mean that it is right in all cases.  It’s a NoOps, NoSQL, Big Data analysis tool, meant to be used at massive scale in conjunction with other Big Data tools.  I recommend that you check out the documentation for Bigtable as there is much more to be found there. And let me know if you need more clarification on anything.

PHP on App Engine Updating to PHP 5.5

A few weeks ago Google Cloud Platform released an update for PHP on App Engine that enabled PHP 5.5 on App Engine. It was all very exciting, and there was a forum post about it and everything. At the bottom though there is a little note:

After 16th April, 2015 we will begin automatically migrating all applications to the php55 runtime.

You may have also seen emails from “Google App Engine” reminding you of this. In those notices it has been changed to:

in approximately 2 weeks we will begin automatically migrating your application over to use the PHP 5.5

If you need an extension on PHP 5.4 you can fill out a form to request one.

So this change is coming, and this post is yet another heads up to remind people.

Lumen on App Engine

Laravel announced today that they are launching a new PHP framework named Lumen for building APIs and microservices. It has an emphasis on speed and it is compatible with a subset of Laravel, making for easy migrations to the larger framework.

I wanted to see how easy it would be to run a Lumen app on App Engine, and I fooled around with it a bit. You just do a simple setting in app.yaml to route all of the app’s traffic to the default handler for the Lumen app:

With that I got a basic instance running. I did run into one problem though.  The logger tried to write to disk, which is a no-no for App Engine. If this happens you will get an error like this:

Fatal error: Uncaught exception 'UnexpectedValueException' with message 'The stream or file "[Path to your lumen app]/storage/logs/lumen.log" could not be opened

One little configuration tweak got it working though. In [path to your lumen app]/vendor/laravel/lumen-framework/src/Application.php, you have to tweak the logger a bit.

First you add a reference to the right library:

And then replace the function getMonologHandler() with this:

This advice was taken from the guide for Laravel on PHP for App Engine.

All in all, it is a relatively quick little framework, and using App Engine to scale out the services it provides seems like a no-brainer.

All code shown here is licensed under Apache 2. For more details find the original source on Github.

Sharing Memcache Between Languages in App Engine

In the process of performance testing the ability to swap out languages in App Engine detailed in this post, I stumbled onto something. I was testing performance, and realized that the tests weren’t accurate because of differences in caching. Ideally, to get the tests to be apples to apples, I would just have to get my PHP code and Go code to use the same Memcache instance and keys. (I should have written my testing better, but then if I had I would never have stumbled into this.)

To start, follow the steps to get multiple languages working in a production instance or a development instance.

Assuming you are writing from PHP:

And then to read from Go:

It really is that easy. Now the hard part comes when you want to transfer complex data between the two. Use JSON to encode the objects. Both languages can handle it pretty effortlessly, and Go on App Engine has JSON object handling built in as a codec to its memcache implementation. You could save it in another format like XML, then read and write data like a string while manually encoding and decoding. You could also staple your hand to your desk. Let’s not be a masochist and just do it in JSON – but I suppose it’s your choice.

Once you do that, it’s as simple as encoding to JSON in PHP:

Then decoding in Go.

Note a couple of things:

  • I omitted graceful memcache miss handling. I did so for brevity. Make sure you wrap your memcache code in logic that handles cache misses.
  • If you are not familiar with Go, those ‘json:’ comments aren’t just comments, they’re instructions on how to encode/decode data between Go and JSON.  So you need them, or it won’t work correctly.

  • I ran into an issue with the original version of this code because latitudes and longitudes were coming out of the database into PHP as strings and not floats. When you went to get them out of memcache in Go, it would throw a type mismatch error. There are 2 solutions to this:

    • Cast them correctly to floats before you write to memcache.

    • Use JSON_NUMERIC_CHECK in json_encode to get them to write as proper numerics when you write. This seems like the better solution.
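To show the struct tags and the type mismatch in isolation, here is a small Go sketch that uses only the standard `encoding/json` package, with no memcache involved. The `Location` struct, its field names, and the sample payloads are assumptions for illustration, not the code from my app.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is a hypothetical payload shared between PHP and Go.
// The `json:"..."` tags tell encoding/json which JSON key maps to which field.
type Location struct {
	Name      string  `json:"name"`
	Latitude  float64 `json:"latitude"`
	Longitude float64 `json:"longitude"`
}

// decode parses a JSON document as it might be read back from memcache.
func decode(raw string) (Location, error) {
	var loc Location
	err := json.Unmarshal([]byte(raw), &loc)
	return loc, err
}

func main() {
	// What PHP writes when the coordinates are real numbers,
	// for example after json_encode($data, JSON_NUMERIC_CHECK).
	good := `{"name":"Philadelphia","latitude":39.95,"longitude":-75.16}`
	// What PHP writes when the coordinates are still strings.
	bad := `{"name":"Philadelphia","latitude":"39.95","longitude":"-75.16"}`

	loc, err := decode(good)
	fmt.Println(loc.Latitude, loc.Longitude, err)

	// Decoding a JSON string into a float64 field returns an error.
	_, err = decode(bad)
	fmt.Println("string coordinates rejected:", err != nil)
}
```

The second payload is exactly what you get when the database hands PHP strings, which is why fixing it on the PHP side before the write is the cleaner option.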

Why do this? For starters, I was doing it so both versions of my API could take advantage of caching done by the other language. But I am sure there are other uses:

  • Communication between these modules

  • Offloading an expensive data retrieval and processing step to Go then reading memcache from PHP.

  • I’m curious if anyone reading has any thoughts.

Note: This will work on either type of memcache solution on App Engine: shared or dedicated. Just make sure you handle cache misses gracefully.
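Since I keep saying “handle cache misses,” here is roughly what I mean, as a minimal Go sketch. The `Cache` interface, `mapCache`, `ErrCacheMiss`, and `fetch` are hypothetical stand-ins so the example runs anywhere. On App Engine you would call the memcache package instead and check for its own cache-miss error.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCacheMiss is a stand-in for whatever your cache client returns on a miss.
var ErrCacheMiss = errors.New("cache miss")

// Cache is a hypothetical minimal interface, not the App Engine memcache API.
type Cache interface {
	Get(key string) (string, error)
	Set(key, value string) error
}

// mapCache is an in-memory Cache used only to make the sketch runnable.
type mapCache map[string]string

func (m mapCache) Get(key string) (string, error) {
	v, ok := m[key]
	if !ok {
		return "", ErrCacheMiss
	}
	return v, nil
}

func (m mapCache) Set(key, value string) error {
	m[key] = value
	return nil
}

// fetch returns the cached value for key. On a miss it calls load,
// writes the result back to the cache, and returns it.
func fetch(c Cache, key string, load func() (string, error)) (string, error) {
	v, err := c.Get(key)
	if err == nil {
		return v, nil
	}
	if !errors.Is(err, ErrCacheMiss) {
		return "", err // a real cache failure, not a miss
	}
	v, err = load()
	if err != nil {
		return "", err
	}
	// A failed write-back is not fatal; the caller still gets the value.
	_ = c.Set(key, v)
	return v, nil
}

func main() {
	c := mapCache{}
	loads := 0
	load := func() (string, error) {
		loads++
		return `{"name":"Philadelphia"}`, nil
	}
	first, _ := fetch(c, "location:1", load)
	second, _ := fetch(c, "location:1", load)
	fmt.Println(first == second, loads)
}
```

The same shape works on the PHP side: try the cache, fall back to the expensive call on a miss, then write the result back so the other language can pick it up.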

 

All code shown here is licensed under Apache 2. For more details find the original source on Github.

Two Languages in App Engine Development

In my last post I outlined getting Go and PHP to act as modules in the same App Engine instance. However, I only really tested it on a “production” App Engine instance. I didn’t test it in development, because I typically use the Google App Engine SDK for each respective language separately.

When I tried the combined dispatch.yaml on the Google App Engine SDK for PHP I got the following error on a Mac running OS X 10.10.2 (Yosemite):

OSError: [Errno 2] No such file or directory: '/Applications/Development/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/goroot/pkg/tool'

When I tried the combined dispatch.yaml on the Google App Engine SDK for Go I got the following error:

The development server must be started with the --php_executable_path flag set to the path of the php-cgi binary.

And when I used the --php_executable_path option with any of the copies of PHP on my system – including the ones that are buried in the PHP SDK – I got:

_PHPEnvironmentError: No input file specified.

After struggling a bit with this here is the easiest solution I found:

  • Find the location of goroot in the go_appengine folder

  • Find the location of the SDK for php by running

  • Create a symbolic link to goroot in go_appengine in the PHP SDK folder that contains dev_appserver.py

After that you can test your dispatch file in development by running:

Where api_php is the folder your PHP module is in, and api_go is… well you know what I’m saying.

Now, I went out of my way there to say this was the easiest way of doing it.  Not that it wasn’t a hack, or that it was a supported way of doing it.  But it does work.

All code shown here is licensed under Apache 2. For more details find the original source on Github.

Two Languages in One App Engine App

The other day I was talking to students at a bootcamp about languages. I made the comment that language performance can vary depending on what a particular language is best at doing. When you run into performance issues it can sometimes be helpful to try rewriting pieces of your app in a particular language for a performance boost.

I thought about how that could be done in App Engine. Let’s say I have a section of an application that I wrote in PHP, but it was getting more load than expected, so I need to boost its performance. I want to try and see if Go could give me the boost I need. How hard is that to do?

Please keep in mind all of the caveats here. Sometimes you can get a boost, sometimes it’s worth exploring. You know, it was a theoretical conversation. And for the record, this need to drop to another language doesn’t have to be performance related. It could be due to SDK or API restrictions, or developer knowledge, or just plain “I want to use another language to do this.”

In App Engine we do this through the use of modules. Modules allow us to separate front end and back end code from each other. They also allow us to break up large applications into manageable chunks. In this case, we’re going to use them to break up code into multiple languages.

Let’s assume that you have an application with an app.yaml that looks like this:

Let’s say that you want to swap out the distance method for Go. The first thing you need to do is write a dispatch.yaml, which looks like this:

This will redirect all calls to your App Engine app to the PHP application above. Which is what has been happening to date. But this is a setup step for later. You then have to add the dispatch file to your application. In a command prompt, from the folder containing dispatch.yaml, run:

Write a replacement for your distance method in Go. Go on, we’ll wait…
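If you want something to start from, here is a minimal sketch of what such a replacement might look like. I am assuming the distance method computes a great-circle distance between two latitude/longitude pairs. The haversine formula, the query parameter names, and the JSON response shape are all my assumptions for illustration, not a description of any particular app.

```go
package main

import (
	"fmt"
	"math"
	"net/http"
	"strconv"
)

// earthRadiusKm is the mean Earth radius used by the haversine formula.
const earthRadiusKm = 6371.0

// haversineKm returns the great-circle distance in kilometers between two
// points given in decimal degrees.
func haversineKm(lat1, lon1, lat2, lon2 float64) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(lat2 - lat1)
	dLon := toRad(lon2 - lon1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(lat1))*math.Cos(toRad(lat2))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// distanceHandler answers /distance?lat1=..&lon1=..&lat2=..&lon2=..
// The query parameter names are hypothetical.
func distanceHandler(w http.ResponseWriter, r *http.Request) {
	var vals [4]float64
	for i, name := range []string{"lat1", "lon1", "lat2", "lon2"} {
		v, err := strconv.ParseFloat(r.URL.Query().Get(name), 64)
		if err != nil {
			http.Error(w, "bad parameter: "+name, http.StatusBadRequest)
			return
		}
		vals[i] = v
	}
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprintf(w, `{"km":%.2f}`, haversineKm(vals[0], vals[1], vals[2], vals[3]))
}

func main() {
	// Register the handler; this sketch never starts a server, it only
	// prints one sample distance so it can be run as-is.
	http.HandleFunc("/distance", distanceHandler)
	fmt.Printf("%.1f km\n", haversineKm(39.95, -75.16, 37.34, -121.89))
}
```

Whatever your method actually does, the point is that the Go version has to answer the same URL with the same response format as the PHP one, so callers can’t tell which language served them.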

Ok, assuming you’ve done that you write out an app.yaml for the Go code you wrote:

Take note of the module name. It has to be different from the original app’s module, which should be “default.”

Once you have all of that handled, you need to tweak your dispatch.yaml so that calls made to the PHP version of the distance method are routed to the Go version instead:

Rerun the dispatch update:

And there you go, the original PHP service will answer all other calls, but the Go service will answer calls for /distance.

Running multiple language solutions in the same App Engine instance can solve some problems for you. It also has a few interesting ramifications. These include the ability to use the same shared Memcache instance between Go and PHP. I’m going to show that off in my next blog post.

All code shown here is licensed under Apache 2. For more details find the original source on Github.

PHP on App Engine Does cURL

A nice upgrade came about in the 1.9.18 release of App Engine SDK: PHP on App Engine can now support cURL. There are a few caveats that go with it, but it’s a nice step forward.

There are two implementations: cURL_lite and full fledged cURL.

To Enable cURL_lite

  1. Add the directive google_app_engine.enable_curl_lite = "1" to your php.ini file.

Caveats

  • cURL_lite is only allowed to make calls to HTTP or HTTPS endpoints
  • cURL_lite didn’t work on my local development server without tweaking the runtime to php55, but it works for php in production
  • cURL_lite doesn’t require the application to have billing enabled

To Enable cURL

  1. Change your runtime setting in your app.yaml from php to php55.
  2. Add the directive extension = "curl.so" to your php.ini file.

Caveats

  • cURL is only available in App Engine’s PHP 5.5 implementation
  • cURL can only be used by applications that have billing enabled
  • cURL is subject to the restrictions of App Engine’s sockets, which include:
    • Sockets cannot target Google domains
    • Sockets may be reclaimed after 2 minutes of inactivity

Now regardless of the implementation, you still call cURL using the “curl_” commands, just the underlying technology changes.

Supporting Documentation

What To Expect from Cloud Security Scanner

I got to experiment a bit with Google Cloud Security Scanner yesterday, and wanted to share with you my experiences, set expectations and what not.

  1. It’s a Front End test. We spin up a bunch of Chrome instances and have them go at your site as a browser.  We aren’t scanning your code on the server side.  We’re testing as if we are on the outside trying to get in.
  2. It’s App Engine only. You get to it through the Developer Console menu for App Engine. It’s not a general purpose scanner.
  3. Read the documentation. Everything I was confused by for even a moment was noted there. The thing that confused me most was the fact that I ended up getting 150 or so emails from my contact form. Once I understood what was going on, I was all cool with it, but at first I was wondering what the heck was going on.
  4. It’s going to take a while. It scanned 1607 urls on my site in 1 hour 23 minutes. It’s doing a comprehensive scan, while rendering pages in Chrome and running XSS tests. It also limits its requests per second to not become a nuisance.
  5. There is no charge except… The scan does not have a charge associated with it. However, it is making requests of your site, and those requests count against usage and quota. That being said, for me it didn’t even cause a dent in my usage and quota, and I have them all set pretty low. Obviously your mileage may vary depending on the nature of your site. But for my relatively low-traffic WordPress blog, running with default quotas, it didn’t cause a blip.

Read more about it, especially the Getting Started Section. If you have an App Engine site, give it a try, fool around with it, and tell us what you think.

…Hello Google

Starting December 1st, I’m going to be a Developer Advocate for Google Cloud Platform. It’s a similar role to what I’ve done before: go out to events or reach out online, and talk to people about technology that can help them. But Advocates are less about marketing than Evangelists, and more about product improvement. The idea is that while we’re out talking to people, we listen to their feedback and bring it back to the product teams. Evangelists do that too, but my gut feeling is that organizations with “Advocates” take that feedback much more seriously.

I’ll be talking about an awesome product. Or more accurately, suite of products. From Platform as a Service and Virtual Machines to Storage, Databases, and Big Data queries, there is a lot to talk about, and lots of rabbit holes to wander down. I intend to wander down a few of them and bring you all along.

I’ll be talking to developers again, which is awesome. The past few years found me drifting further and further away from the developer communities that inspired me to get into this line of work 6 years ago. My work angst for the past 12 months and the work and projects I did to prepare for and secure this job made it very clear that this is what I really want to be doing.

I’m joining a team of intimidatingly smart people. And I do mean “intimidatingly” ’cause the interview process is as challenging as all the rumors make it out to be. But everyone I met along the process was incredible to interview with, and I can’t wait to start working with them.

I find myself reporting once again to Greg Wilson, and I honestly couldn’t be any happier about that. Good managers are both rare and more important than people think they are. When you find one, count yourself lucky, and if you can work for a manager you’ve confirmed is good, well, you do it.

Google culture encourages workers to informally collaborate. They find that keeping people in the same space yields better collaboration. And despite all of the advantages to working remotely I missed the serendipitous hallway meetings. So after 6 years remote, I find myself returning to daily commutes. I always said I couldn’t go back – but then again, when there is free Coke Zero, showers, nap pods, and brilliant co-workers – maybe it might be even better than working from home. I’ll miss seeing my kids the way I used to, but frankly, now that they’re in school, I don’t see them as much as I’d like to anyway.

You might be asking: Hey, does Google have an office in Philadelphia? Actually they appear to, but it’s not an office with any Cloud engineers. So my family and I are leaving Philadelphia for somewhere in the Bay Area, probably San Jose. This was not an easy choice, but I am very excited about the prospect. We’ll be around for the rest of 2014, with us moving in the beginning of 2015.

So let me finish by pointing out that none of this would be possible without the encouragement and support of my wife, Janice. She was my practice interviewer, cheerleader, and sounding board. When the very people interviewing you point out that “Imposter Syndrome” is a huge part of the interview process, it’s hard not to get lost in your head second guessing yourself. Janice was consistently convinced that I could get the position, and even helped me convince myself sometimes. And when I did get it, she agreed to move across the country to a place where we have no roots, with 2 children in tow. Not only did she agree to it, she embraced it for the opportunity it is. That doesn’t mean it isn’t terrifying for the both of us, but at least for me it is less so, ’cause she’s going to be by my side.

So there you have it: lots of change. I think they’re awesome changes, and I can’t wait.

How did you become an Adobe Evangelist?

Yesterday via twitter, I was asked a very ironic question:

So tell me @tpryan how does one become an @Adobe evangelist?! I must know.
ThinkCreativeKC

I figured I would give answering it a go. Keep in mind that I did this 5 years ago when Adobe was trying to do very different things. I don’t know that this would land you at Adobe anymore. I distinctly think it wouldn’t. See, the job I originally landed was “Developer Evangelist.” I slowly morphed into being a broader design-focused evangelist over the past 5 years as Adobe’s focus on developers waned and more and more people were focused on Creative Cloud. So this wouldn’t work at Adobe today, but it could land you in a developer-focused evangelism/advocacy role at another company.

Discover the role
My first introduction to the idea of an evangelist was Ben Forta in his role as ColdFusion Evangelist. I remember at the time being wowed that there existed a job where you had to fool around with new technology, blog about it, and talk about it at conferences. That seemed like a dream job, and I figured it wasn’t a career that you could plan for. It wasn’t until later I discovered that Ben wasn’t in a one off situation. There were developer evangelists all over the place.

Network for the role
A good friend of mine whom I met working at The Wharton School, Ryan Stewart, also was very much into the idea of being an evangelist. He ended up in the role before me and confirmed for me that it was in fact an awesome job and that I would be a good fit. I also met a couple of people connected with the product I really wanted to evangelize, ColdFusion. I connected with Ben Forta, Adam Lehman, and a few of the product managers. I also participated in the prereleases for the product, and got myself involved with Adobe’s user group community. All of these things gave me good connections and good name recognition with the people who would hire for the evangelist position. That wasn’t necessarily the reason I was doing any of it at the time. I was doing it ’cause I loved playing with the latest and greatest tech, and the community was very rewarding, but in retrospect these things helped me a lot.

Prepare for the role
At some point I decided I wanted the role, and I constructed the outline of a 5 year plan for getting the job. I looked at the externals of what an evangelist did. They experimented with the technology, showed how you could integrate it into other technology, and then they blogged about it and spoke at conferences. So I played with tech, got it to do new things, and then blogged and spoke about them. The idea was to prove I could do the job, before I was actually doing the job. This combined with my networking led to bigger and better speaking gigs, which allowed me to network more, which became a positive feedback loop.

Get Lucky
At this point I was a member of a pool of likely candidates for the role. I had applied once before. I knew everybody involved and had shown I could do the job. Then my friend Adam Lehman got hit by a car in London and was travel limited for a few months creating an opening for a replacement. And just like that my 5 year plan happened in 2. Luckily Adam recovered, and went on to do great things in product management. But it’s a terrible way to luck into a job.

For me it came down to being the right person in the right place at the right time. Some of that is preparation, and some of that is luck. You can control being the right person, in my case by prepping for the role. You can have some control over getting yourself in the right place. Getting myself on the short list was partially in my control, by networking, but someone else made the call to keep me on that short list. And I had no control over Adam being hit by the car, despite what some people may claim.

Some of these things would have to be updated for the current moment. Do you have to blog? Or is tweeting a combination of gists and GitHub projects enough? Maybe, maybe not, but the main point here is that you have to explore tech and then share your findings. Are corporate sponsored user groups still as impactful? Or do you need to focus on meetups and regional conferences? Again, the details aren’t as important as the fact that you are finding where peers and trend setters are, and engaging with them there.

So there you have it. Pretty much the way you get any other role. Figure out you want it, prepare your skill set for it, network with the people who do the hiring, and then, instead of assassinating anyone in your way, be ready to take the opportunity if it comes up.