We are very sorry for the downtime! It is painful for the whole team when something is not functioning right and we are disappointing our customers. You can be sure that lessons have been learned and we are working hard to prevent such failures in the future.
What happened:
There was an unprecedented spike in client activity. Some of our services didn’t scale properly and the performance of the system degraded dramatically. As a result, many of you couldn’t log in for about an hour.
How are we going to prevent such failures in the future?
Short-term:
We’ve found the weak spots and we are working on optimizations.
Long-term:
We’re setting new scalability goals so that our infrastructure is ahead of our client growth.
Side notes:
We should have posted a “We have a technical problem” message on Twitter so you would at least have known what was happening. But that wouldn’t have stopped people from making negative comments. There’s no elegant way to handle such situations. The solution is to prevent them.
The reality is that until yesterday we had stellar uptime. Yes, there was a similar issue in 2016. That was 4 years ago.
We want to be completely honest with you: we cannot give you a 100% guarantee that there will be no downtime in the future - it’s a very complex system and we are releasing the new features you ask for at a rate that is unprecedented for this industry.
What we can guarantee you is our total dedication to preventing it.
@AlexK I agree with pretty much 99% of what you wrote, but I have to add one point.
Having worked with enterprise software, big corporations and various stakeholders, I can say the biggest complaint clients have about vendors is usually lack of communication.
I agree that proactive communication will not calm 100% of the public, and there will always be a small percentage of negative folks/comments whatever you do.
However, I believe that in this day and age CX should be at a level where service providers have folks dedicated to communicating with the general public in good times and bad.
This will put you way above your competitors. No X, Y or Z feature can do what simple, proactive, transparent communication can do for your client relationship/experience…
That was a big problem because it interferes with our money, but we have to see both sides here. Sometimes there is not much anyone can do, and from the explanations, this seems to be one of those cases.
Having said that, you are on the right track. Keep it up.
Unfortunately, bugs and issues in technology usually only appear when such high-stress scenarios occur out of nowhere. Last time I believe it had to do with crypto, and this time it was a result of Tesla. The upside is that next time the stress would have to be far greater to cause any issues, and it may not lead to a similar result as these last two times.
If the community notices any movements in the market that seem likely to generate a lot of interest, it would be great if we could return the favour by at least warning the Team so that they can look at any preparations that may need to be made.
Proactive cooperation is often the best way to both prevent and remedy most problems.
You send us contract notes and trading updates via email every day. Could you not send an email when the service is down? (Am I being IT illiterate?)
I’ve noticed that the app is hanging a bit more than it used to. I assume this is a result of the above: bigger stock universe, more features, more customers, etc.? One to watch, as it can be a wee bit annoying sometimes. Not too worried as a regular stock investor, but I would be a little worried if I was a CFD trader.
I suppose it is important to differentiate: while some will perhaps be losing lots of money, most people may only be losing a little or simply won’t be able to make any.
People appreciate transparency and communication, and given what happened with Robinhood lately, I hope it’s a good lesson for everyone in the market, as a lack of it can cost you and, most of all, your customers.
Never underestimate a piece of communication, regardless of how small it is or what people say.
Thanks for your efforts in communicating this and working towards a solution. Hope improvements and scalability are on their way.
RH down today lol. I saw someone point out that their code did not take into account that it’s a leap year, which made it try to access data in the future and caused a lot of errors.
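For anyone wondering how a leap-year slip could produce a "future" date, here is a rough sketch of that general class of bug. To be clear, this is not RH's code and the claim above is secondhand; the function names and the day-of-year scenario are made up purely for illustration. The buggy version uses a fixed table of month lengths, so in a leap year it maps day 60 to 1 March instead of 29 February.

```python
from datetime import date, timedelta

# Month lengths for a NON-leap year. The hypothetical bug is using this
# table unconditionally, whatever the year.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def naive_date_from_day_of_year(year, day_of_year):
    """Buggy illustration: converts a day-of-year to (year, month, day)
    while ignoring leap years."""
    remaining = day_of_year
    for month, length in enumerate(DAYS_IN_MONTH, start=1):
        if remaining <= length:
            return (year, month, remaining)
        remaining -= length
    raise ValueError("day_of_year out of range")


def correct_date_from_day_of_year(year, day_of_year):
    """Lets the standard library handle the calendar, leap years included."""
    d = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return (d.year, d.month, d.day)


if __name__ == "__main__":
    # Day 60 of a leap year is 29 February; the naive version is a day ahead.
    print(naive_date_from_day_of_year(2020, 60))
    print(correct_date_from_day_of_year(2020, 60))
```

In a non-leap year both functions agree, which is why this kind of mistake can sit unnoticed for years and only show up on 29 February or shortly after.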