31 May 2018
Following on from this week’s Connected, I thought I’d do my own wish list/happy list for WWDC 2018.
Here are my top 5 hopes for next week’s announcements …
1. Siri improvements
Siri is currently so far behind Alexa and Google Assistant it’s a joke - not just on quality of results, but also because of the very restricted set of domains developers can respond to.
I’d love it if a much more flexible way of returning results was introduced. In particular, let developers develop their own intents and grammars to parse Siri queries - just like Alexa and Google do - and not have to wait for Apple to support individual problem domains at their glacial pace.
Bonus happiness if Apple Watch apps could hook into the potentially very useful but currently very limited Siri watch face.
2. Cross platform development
I don’t really believe the rumours of ‘Project Marzipan’, but it would be fantastic if a cross-platform iOS/Mac development platform were announced.
I definitely think some of my iOS apps could work well on the Mac, but I’m not going to consider porting them unless it’s very little work.
Partial happiness if support for pointing devices/mice/trackpads is announced for iOS. It would sure make using my iPad Pro as a work machine a bit easier, and it’s probably a requirement for any cross-platform support going forward.
3. Updated Mac Mini
I used to have a Mac Mini, but it was increasingly underpowered as a development machine so I switched to a 2015 MacBook Pro a couple of years ago.
I can’t really justify splashing out on an iMac right now (or even more an iMac Pro at those prices!), but I’d love a reasonably priced desktop I could use as a development machine - especially one that was always on that I could remote into as necessary.
4. Better WatchKit or full UIKit on Apple Watch
Apple Watch hardware is becoming increasingly powerful, and mobile connectivity is now supported. It has the potential to be really useful for some use cases.
However right now it’s really painful to make any sort of rich interface on Watch with the very limited WatchKit frameworks.
Clearly Apple’s own apps use another more powerful framework - UIKit? - and it would be great if developers were allowed access to this too.
5. Real-time Watch complications
Another missed opportunity on the Watch is being able to update complications on the watch faces in real-time.
The strength of the watch is having time-relevant information presented when you need it. However watch face complications can only be updated on a very restricted schedule, which makes lots of great ideas for real time info right on the Watch almost impossible.
I understand why the underpowered original watches were restricted to occasional complication updates, but we really need to move past these restrictions if we want to move the platform forward.
If this means leaving Series 0 Watch owners behind - and I’m one of them at the moment - so be it.
To be honest I’m not really expecting any of these five things to be announced. I’d be pretty happy if any of them are, very happy if 2 or more are, and if all 5 were it would be a miracle!
30 May 2018
I’ve been pretty much all-in on the Amazon Echo as my home voice system, and I’m still loving having multiple devices around our home that we can call out commands to.
However I’m always looking to expand Yeltzland onto new platforms, so I’ve ported the Alexa skill I wrote about a while back on to the Actions on Google platform.
This is a summary of what I learnt doing that, and my view on the advantages and disadvantages of developing for each platform.
Actions on Google Advantages
Also available on phones
The main advantage of Google Assistant - one I hadn’t realised until I started this, even though it’s actually pretty obvious - is that it’s available on phones as well as the Google Home speaker.
On newer Android phones Google Assistant might be installed out of the box (or can be installed on recent versions), and there is also a nice equivalent iOS app.
I’ve just bought a Google Home Mini to try out, and it’s definitely comparable to the Echo Dot it sits next to, but I’ve found myself using Google Assistant a lot more on my iPhone than expected.
Visual responses are nicer
Because the Google Assistant apps are so useful, there is a lot more emphasis on returning visual responses to questions alongside the spoken responses.
Amazon does have the Echo Show and the Echo Spot that can show visual card information, but my uneducated guess is they make up a small percentage of the Echo usage.
Google offers a much richer set of possible response types, which unsurprisingly look a lot like search result answers.
In particular, the Table card - currently in developer preview - offers the chance to provide a really rich response, which suits the football results I return very well.
Nice development environment
Both the Actions on Google console (used for configuring and testing your action), and the Dialogflow browser app (used for configuring your action intents) are really nice to use.
Amazon has much improved their developer tools recently, but definitely a slight win to Google here for me. In particular, for simple actions/skills Dialogflow makes it easy to program responses without needing to write any code.
Using machine learning rather than fixed grammars to match questions to intents
Google states it’s using machine learning to build models that match questions to your stated intents, whereas Amazon expect you to be specific in stating the format of the expected phrasing.
Now from my limited testing - and since I’m basically implementing the same responses on both platforms - it’s hard to say how much better this approach is. However, assuming Google are doing a good job (and with their ML skills it’s fair to assume they are!), this is definitely a better approach.
Allowing prompts for missing slot values
Google has a really nice feature where you can specify a prompt for a required slot if they’ve matched an intent, but not been able to parse the parameter value.
For example, one of my intents is a query like “How did we get on against Stourbridge?” where Stourbridge is an opposition team matched from a list of possible values.
Amazon won’t find an intent if it doesn’t make a full match, but on Google I can specify a prompt like “for what team?” if it makes a partial match but didn’t recognise the team name given, and then continue on with the intent fulfilment.
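To make the behaviour concrete, here is a minimal sketch of the prompting idea in plain JavaScript. It is not Dialogflow’s API: the team list, the resolveTeamSlot function and the result shape are all hypothetical, and the matching is a simple substring check rather than Google’s ML model.

```javascript
// Hypothetical list of allowed values for the "team" slot (illustrative only).
const TEAMS = ["Stourbridge", "Kidderminster", "Hednesford"];

// Hypothetical sketch: the intent has already matched, now try to fill the slot.
// If no team is recognised, return a prompt instead of failing the whole intent.
function resolveTeamSlot(query) {
  const lower = query.toLowerCase();
  const team = TEAMS.find((t) => lower.includes(t.toLowerCase()));
  if (!team) {
    return { complete: false, prompt: "For what team?" };
  }
  return { complete: true, team };
}
```

A query that names a known team resolves straight away, while anything else comes back with the prompt so the conversation can carry on.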
Actions on Google Disadvantages
Couldn’t parse “Yeltzland” action name
A very specific case, and not a common word for sure, but Google speech input just couldn’t parse the word “Yeltzland” correctly. This was very surprising, as I’ve usually found Google’s voice input to be very good but it kept parsing it as something like “IELTS LAND” 😞
You also have to get specific permission for a single word action name - not really sure why that is - so I’ve had to go with “Talk to Halesowen Town” rather than my preferred “Talk to Yeltzland” action invocation. It all works fine on Amazon.
SSML not as good
A couple of my intents return SSML rather than plain text, in an attempt to improve the phrasing of the responses (and add in some lame jokes!).
This definitely works a lot better on the Echo than on Google Assistant.
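For anyone who hasn’t used SSML, this is roughly what such a response might look like. It is only a sketch: the function names and the example text are made up, and the 500ms pause is an arbitrary choice. The outputSpeech shape follows the Alexa response format as I understand it.

```javascript
// Escape characters that would otherwise break the SSML markup.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: put a short pause between a sentence and a punchline.
function buildSsml(sentence, punchline) {
  return `<speak>${escapeXml(sentence)} <break time="500ms"/> ${escapeXml(punchline)}</speak>`;
}

// Alexa-style output speech object carrying SSML rather than plain text.
function ssmlOutputSpeech(ssml) {
  return { type: "SSML", ssml };
}
```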
What about Siri?
All this emphasises how far Siri is behind the other voice assistants right now.
Siri is inconsistent on different devices, often has pretty awful results understanding queries, and is only extensible in a few limited domains.
I really hope they offer some big changes at next week’s WWDC 2018 - maybe some integration with Workflow as I hoped for last year - but I don’t hold much hope any more that they can make significant improvements at any sort of speed. Let’s hope I’m wrong.
As you can tell I’m really impressed with Google’s offering here, and it definitely seems slightly ahead of Amazon in offering a good environment for developing voice assistant apps.
In particular, having good mobile apps offering the chance to return rich visual information alongside the voice response is really powerful.
My “Halesowen Town” action is currently in review with Google (as of May 30th, 2018), so all being well should be available for everyone shortly - look out for the announcement on
P.S. If you are looking for advice or help in building out your own voice assistant actions/skills, don’t hesitate to get in touch at firstname.lastname@example.org
15 May 2018
After watching last week’s fascinating Google I/O Conference, I’ve been thinking about porting my Yeltzland Alexa Skill to Google Assistant.
The Alexa Skill runs as an AWS Lambda function, and as it was my first attempt at writing a skill the code wasn’t particularly well designed for reuse.
Therefore I thought it was a good idea to find out how to:
- Run AWS Lambda code locally
- Write some unit tests against the code to check it’s running correctly
- Refactor the code to extract the reusable business logic into a separate module, ready for reuse (using the unit tests to check I haven’t introduced any regressions)
Running AWS Lambda code locally
There are some great tools from Bespoken that make it pretty easy to run your AWS Lambda code locally.
The steps are as follows:
- Install the tools by running npm install bespoken-tools -g
- Start the proxy server by running bst proxy lambda index.js, where index.js is your Lambda code module
This sets up the Lambda function listening on http://localhost:10000 for requests.
Writing unit tests against the Lambda code to check it’s running correctly
Firstly, I wrote a simple test harness that builds some JSON in the same format as an Alexa request, POSTs it to the proxy server set up as above, and checks the response.
My skill uses dynamic data (my football team’s fixtures and results) that changes over time, so for my unit tests I just wanted to check the first part of the response - generally the non-dynamic part.
This was sufficient for my refactoring efforts, and I didn’t want to go to the effort of mocking the data requests part of my code right now.
I then wrote some simple Mocha unit tests to call each of my skill’s intents, and verify the response was basically as expected.
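A harness along these lines might look something like the sketch below. This is not the code from my repository: the helper names, the intent and slot names, and the placeholder IDs are all assumptions, the request body is trimmed to the essentials, and postToProxy assumes Node 18+ for the built-in fetch.

```javascript
// Build a minimal Alexa-style IntentRequest body. IDs are placeholders.
function buildIntentRequest(intentName, slots = {}) {
  const slotMap = {};
  for (const [name, value] of Object.entries(slots)) {
    slotMap[name] = { name, value };
  }
  return {
    version: "1.0",
    session: {
      new: true,
      sessionId: "test-session",
      application: { applicationId: "test-app" },
      user: { userId: "test-user" },
    },
    request: {
      type: "IntentRequest",
      requestId: "test-request",
      timestamp: new Date().toISOString(),
      locale: "en-GB",
      intent: { name: intentName, slots: slotMap },
    },
  };
}

// Pull the spoken text (plain text or SSML) out of an Alexa-style response.
function spokenText(response) {
  const speech = response.response.outputSpeech;
  return speech.type === "SSML" ? speech.ssml : speech.text;
}

// Only the start of the response is checked, as the rest is dynamic data.
function startsWithExpected(response, expectedStart) {
  return spokenText(response).startsWith(expectedStart);
}

// POST the request to the local proxy described above (assumes Node 18+ fetch).
async function postToProxy(body, url = "http://localhost:10000") {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
}
```

Each Mocha test would then build a request for one intent, await postToProxy, and assert that startsWithExpected returns true for the fixed opening words of that intent’s reply.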
Refactoring the code
Adding the following sections to my package.json file makes it easy to run all of the unit tests with a simple npm test:
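The original snippet isn’t reproduced here, so as a rough guide a typical setup would be something like the fragment below. It assumes Mocha is installed as a dev dependency and that the tests live in Mocha’s default test directory; the version number is an assumption too.

```json
{
  "scripts": {
    "test": "mocha"
  },
  "devDependencies": {
    "mocha": "^5.0.0"
  }
}
```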
I then moved all of my business logic to a separate yeltzland-speech module, checking the tests still passed after each change. I’m pretty confident I didn’t introduce any problems, even though the code has been significantly refactored.
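The shape of that split is roughly as follows. This is a sketch rather than the real module: the speech object stands in for yeltzland-speech, the nextGame function and its fixture argument are hypothetical, and the two adapters use the Alexa response envelope and the Dialogflow v2 webhook fulfillmentText field as I understand them.

```javascript
// Platform-neutral business logic (hypothetical stand-in for yeltzland-speech).
// Fixture data is passed in, so the sketch needs no network access.
const speech = {
  nextGame(fixture) {
    const venue = fixture.home ? "at home to" : "away at";
    return `Our next game is ${venue} ${fixture.opponent}`;
  },
};

// Thin Alexa adapter: wraps plain text in an Alexa response envelope.
function toAlexaResponse(text) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text },
      shouldEndSession: true,
    },
  };
}

// Thin Dialogflow adapter: wraps the same text for a v2 webhook response.
function toDialogflowResponse(text) {
  return { fulfillmentText: text };
}
```

With the wording in one place, each platform handler is just a call into the shared module followed by the matching adapter.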
If you are interested in the code described here, it’s all on GitHub at https://github.com/bravelocation/yeltzland-alexa/
10 Apr 2018
There has been lots of news recently about privacy, driven mostly by Facebook’s rather laissez-faire approach over the years, as well as the upcoming GDPR changes required by the EU.
I’m pretty happy with the general tone of GDPR, and although it adds overhead to the development process, I think forcing developers to be clear about their use of your data is a very good thing.
Therefore I thought it would be a good time to outline the approach I generally take on my mobile apps around data, and try to justify the trade-offs I’m making.
Crash monitoring using Crashlytics
It’s extremely useful to have a crash monitoring reporting system in place, so I can see any problems as soon as possible. If an app is having non-critical but important problems it’s good that I can diagnose and fix any issues promptly (especially now Apple’s review process is much quicker).
I also get basic usage figures (daily and monthly active users) from Crashlytics that are sufficient for what I need.
Crashlytics is part of the Fabric suite of apps that was part of Twitter and has recently been bought by Google. I really like the simplicity of both their integration process and their website.
Obviously they/Google - like most of their development tools - give this away so they can get aggregated data about which apps are popular, presumably for search ranking and other corporate needs.
I also assume they do send enough information that allows them to piece together a (semi-)anonymous usage pattern for all the apps on a device that use Crashlytics, but that’s just idle speculation.
I think the trade-off is (just about) fine, especially as there is no open-source self-service alternative that I know about. I’d definitely be interested in such a solution if it was easy to install and maintain, but practically I don’t have the time or interest to roll my own. I’m definitely going to investigate this more though.
A couple of my clients have existing Google Analytics solutions for their websites, so I’ve added GA tracking into their apps so they have a one-stop solution.
I’ll only do this if really necessary, as for me the Crashlytics-only solution outlined above is sufficient. I don’t want to capture more information than necessary. However now that Crashlytics is owned by Google I’m not sure this policy makes sense, and I suspect the products will be merged together in the not too distant future.
For those apps that require remote notifications, I’ve moved over to using Firebase to manage this process.
Firebase offers a nice cross-platform solution for sending notifications, and again offers something I really don’t want to build myself.
Firebase is another Google acquisition, so just about all the caveats I mentioned above for Crashlytics apply here also.
Clearly I’m heavily dependent on using free Google services to provide useful services to help run and maintain my apps.
On iOS at least, Apple’s dedication to privacy means I think the trade-off is a reasonable one as the amount of personal information transmitted is reasonably restricted.
On Android, I’d assume that because the user is almost certainly logged into a Play Store account it’s easier for Google to join the dots on what you’re running on your device, but seeing as their Play Store data exposes that anyway it’s no additional change in your privacy.
In an ideal world I’d like to transition to a fully self-hosted analytics and notification service, but until such a solution exists, I can’t see any practical alternative.
Let me know on Twitter via @yeltzland if you know of a good alternative solution!
07 Feb 2018
I recently had to add a Swift framework into a Xamarin iOS app, and it was really complicated.
There are multiple web pages that try to explain how to do this, but none of them matched my eventual solution. Therefore I thought it would be useful to share what I did, in case it’s of some use to someone wrestling with the same problem.
A client I’m working with wanted to integrate the Visa Checkout SDK into their Xamarin iOS app.
Visa have pretty good documentation on how to do this for a native iOS app, but the latest version of their SDK is written in Swift.
Now Xamarin has a tool called Objective Sharpie which lets you import an Objective-C library, but it doesn’t natively support Swift frameworks.
However with quite a bit of effort you CAN get this to work. This is what I did …
N.B. For other frameworks and/or setups these exact steps may not work for you! Take what you can from these instructions and good luck (you’ll need it!)
Instructions for binding the VisaCheckoutSDK framework
- Install Objective Sharpie - instructions here
- Set up a new binding project in Visual Studio for Mac, using Add -> Add new Project … -> iOS -> Library -> Bindings Library on your existing solution
- Download the VisaCheckoutSDK via CocoaPods:
  - Make a new directory
  - Run the command sharpie pod init ios VisaCheckoutSDK
  - This uses Objective Sharpie to set up a pod directory to download the latest SDK code.
- Hack the info.plist file in the downloaded pod to make it look like it was built by the version of Xcode Objective Sharpie uses:
  - Run sharpie xcode -sdks and note the SDK used for iPhone builds e.g. “iphoneos11.2”
  - In the info.plist file in the downloaded VisaCheckoutSDK pod, change the DTSDKName to the value from above e.g. “iphoneos11.2”
- You can now generate the C# binding files used in the binding project you set up earlier by running sharpie pod bind
- Overwrite the ApiDefinition.cs and StructsAndEnums.cs files in the binding project with the files generated in the previous step
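If you need to repeat the DTSDKName edit from the steps above, it could be scripted rather than done by hand. The sketch below is one way to do it in JavaScript, not what I actually ran: setSdkName is a hypothetical helper, and it assumes the plist is in XML form (a binary plist would need converting first, for example with plutil).

```javascript
// Replace the DTSDKName value in an XML-format plist string.
// Throws if the key isn't present, so a silent no-op can't go unnoticed.
function setSdkName(plistXml, sdkName) {
  const pattern = /(<key>DTSDKName<\/key>\s*<string>)[^<]*(<\/string>)/;
  if (!pattern.test(plistXml)) {
    throw new Error("DTSDKName key not found");
  }
  // A replacer function avoids any "$1" ambiguity if sdkName starts with a digit.
  return plistXml.replace(pattern, (match, open, close) => open + sdkName + close);
}
```

You would read the file into a string, pass it through setSdkName with the value reported by sharpie xcode -sdks, and write the result back.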
The generated files won’t build out of the box, so they need editing to get them to work. This is what I did to make them usable:
- Changed enums in StructsAndEnums.cs to be long
- Added Namespace of VisaCheckoutSDK.Touch to generated files
- Fixed up warnings by commenting out duplicate implementations in generated code i.e. those that have the same parameters etc.
- Removed empty generated interfaces
- Commented out “using VisaCheckoutSDK” in ApiDefinition.cs - not 100% sure why this was necessary!
Hopefully you can then build the binding project, and hence use it in your Xamarin iOS app.
One thing to note is I had to add multiple Xamarin.Swift3.* packages in the main app to allow the Swift VisaCheckoutSDK to run. Without the correct Swift packages, the app failed to load after the splash screen on startup.
It was very difficult to consistently get the error message for the missing Swift packages, as the app often crashed before the debugger could attach and capture the error in the logs. I ended up adding all the Swift packages I could find, and then removing them one by one until the app crashed again. Clearly not a great way to do this :(
This was insanely hard, and took me a good couple of days to figure out all these steps.
Having to do this is definitely a big downside of using Xamarin versus native iOS solutions, and has definitely made me think again about when to use Xamarin for cross-platform projects.