The hidden challenges of rebuilding products

At some point, every engineer I’ve ever worked with has said, “Let’s rebuild this product.” Once, when evaluating a SaaS product, the feedback I gave my CEO was: “Instead of buying it, we should just build it from scratch. We’ll get more bang for our buck.” Ultimately, we decided neither to buy it nor to build it. Knowing what I know today, though, I wouldn’t have been so quick to suggest rebuilding it. Building a new product and rebuilding an existing one pose two completely different sets of challenges. The best comparison is building a brand new house versus rebuilding one while living inside it. Now, let me tell you a little bit about what we faced here.

A background story

Here at Codelitt, we are about to deliver one of our biggest projects to date. It is an application that has been around for many years. The customer wanted to improve the experience by redesigning and rebuilding the frontend completely.

For context purposes, the user journey goes like this:

User logs in on the new application
User gets redirected to a domain handling different entities they can have access to
User selects an entity and gets redirected to the new application that we built.

As is common in rebuilds like this, users needed to be able to keep accessing the old application and slowly migrate over to the new one. Thankfully, the project was already structured as a step-by-step process.

This was not our first time doing something like this; it’s actually pretty common. What we had never done before was rebuild an application of this size. The application has 250+ screens for about 12 different user roles, with different permissions for each.

The challenges we faced

The project started about 2 years ago with the intention of building a new application that would replicate the functionality of the old one, but in a newer stack. The goal was to improve the speed, security and user interface of the application.

As time passes, a lot changes in terms of people, processes, requirements, and decisions. With that in mind, here are the main challenges we faced when rebuilding this application.

1. The API wasn’t built with the rebuilt frontend in mind.

This is a very common issue when rebuilding a frontend. The API was built for a specific frontend, and every time the data wasn’t modeled properly for our new screens, we had a problem. For each situation like this, we had a few options:

a) Refactor the API

This is the best option, but we were often dealing with legacy code that was 10+ years old, which made this almost impossible.

b) Build an additional endpoint

While not ideal, this is the second-best option. However, sometimes the data we needed wasn’t readily available at the API level.

c) Bend the data on the frontend

This is not ideal, especially if a lot of transformation is needed, but sometimes it is the only option when the client doesn’t want to spend a lot of resources to improve the API, and there’s no logical place to put the data we need.

We went with options A and B as much as we could. But inevitably, we ran into edge cases that couldn’t be solved that way and had to go with option C.
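When option C is unavoidable, one way to contain the damage is to keep all of the "bending" in a single adapter function, so the screens only ever see the shape they need. The sketch below is illustrative only: the `LegacyExpenseRow` fields, the status codes, and the `toExpenseViewModel` name are assumptions made up for this example, not the real API.

```typescript
// Hypothetical shape returned by a legacy endpoint (assumed for illustration).
interface LegacyExpenseRow {
  EXP_ID: number;
  AMT_CENTS: string; // assumed: amounts arrive as strings, in cents
  STATUS_CD: "A" | "P" | "R";
  OWNER_FNAME: string;
  OWNER_LNAME: string;
}

// Shape the new screens want to render.
interface ExpenseViewModel {
  id: number;
  amount: number; // in currency units
  status: "approved" | "pending" | "rejected";
  ownerName: string;
}

// Lookup table from legacy status codes to readable statuses.
const STATUS_LABELS: Record<LegacyExpenseRow["STATUS_CD"], ExpenseViewModel["status"]> = {
  A: "approved",
  P: "pending",
  R: "rejected",
};

// The single place where the legacy shape is converted into the new one.
function toExpenseViewModel(row: LegacyExpenseRow): ExpenseViewModel {
  return {
    id: row.EXP_ID,
    amount: Number(row.AMT_CENTS) / 100,
    status: STATUS_LABELS[row.STATUS_CD],
    ownerName: (row.OWNER_FNAME + " " + row.OWNER_LNAME).trim(),
  };
}
```

Keeping the transformation in one function means that if the API is ever refactored (option A) or gains a better endpoint (option B), only the adapter has to change.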

2. Different behavior for the same functionality in different places

My first instinct, and I believe that of many developers, is to reuse functionality. In the rebuild, we would often find a page with functionality that’s also present somewhere else; let’s call it an approval flow.

There are 250+ screens, developed over a period of 14+ years, so a lot of things behave differently depending on the task they need to accomplish. A certain approval flow could behave differently on an Expenses screen than on a Sales screen. It’s very counter-intuitive for a developer to accept that what seems reusable actually isn’t.

To tackle this issue, we did two things together:

a) We built our components to be configurable.

b) We set up proper requirements gathering for the approval flows by taking a deep look at the application behavior in the process.
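As a rough, framework-agnostic sketch of what "configurable" can mean here: the shared logic takes a per-screen configuration instead of hard-coding one behavior. The specific rules below (number of approvals, mandatory rejection comments) and all names are invented for illustration; the real differences between screens come out of requirements gathering.

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected";

interface ApprovalState {
  status: ApprovalStatus;
  approvals: number;
}

// Per-screen knobs. These two rules are hypothetical examples.
interface ApprovalFlowConfig {
  approvalsNeeded: number;
  requiresCommentOnReject: boolean;
}

// Assumed configurations for two screens that share the same flow.
const expensesFlow: ApprovalFlowConfig = { approvalsNeeded: 1, requiresCommentOnReject: true };
const salesFlow: ApprovalFlowConfig = { approvalsNeeded: 2, requiresCommentOnReject: false };

// Registers one approval and decides whether the item is fully approved.
function applyApproval(state: ApprovalState, config: ApprovalFlowConfig): ApprovalState {
  const approvals = state.approvals + 1;
  const status: ApprovalStatus = approvals >= config.approvalsNeeded ? "approved" : "pending";
  return { status, approvals };
}

// Rejects the item, enforcing the comment rule only where the screen asks for it.
function applyRejection(
  state: ApprovalState,
  config: ApprovalFlowConfig,
  comment?: string
): ApprovalState {
  if (config.requiresCommentOnReject && (comment || "").trim() === "") {
    throw new Error("A comment is required to reject on this screen");
  }
  return { status: "rejected", approvals: state.approvals };
}
```

With this shape, one approval moves an item to "approved" under `expensesFlow` but leaves it "pending" under `salesFlow`, even though both screens run the same code.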

3. Business logic was buried inside the frontend

We found a lot of business logic buried in the frontend of the application. This always comes as a shock, but if we think about small teams building an app, it is not a surprise. Small teams often need to deliver fast, and sometimes that means writing code wherever they have access at the moment. This comes with a lot of drawbacks for maintenance and code quality.

A related point is that teams normally start out small, without a well-defined process for deciding where code should live. Sometimes the legacy code we face comes from what used to be a startup that grew over time.

Since we were keeping the API as it was, we couldn’t refactor most of these issues away entirely. But even if we could have, the application was fragile enough that a change risked breaking other parts of it, so the best option is normally to leave the logic untouched and replicate its behavior.

4. Database that doesn’t reflect the current state of the application

Another issue that we faced was a database that didn’t reflect the current state of the application. Over time, things change, and sometimes the database has old columns that aren’t being used anymore, or even worse, columns that are still being used by other, unrelated parts of the application.

This problem is almost impossible to overcome, since a lot of reverse engineering is needed to discover which data is used and which is not. Sometimes the organization even has processes that run only occasionally and rely on data that looks unused.

Our solution was to keep the database intact, but make sure we were using the right data on our builds.

5. Lack of consistent definitions

This is a problem we often face when dealing with existing infrastructure: the naming of entities is not consistent. For example, a user in the application could be referred to as User, UserModel, Person, Staff, and others. This commonly happens over time, especially when teams change.

Our approach to this was to keep the naming consistent on our side, but map them to the corresponding entities on the backend.
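A minimal sketch of such a mapping, using the entity names from the example above. The `BACKEND_ALIASES` table and the `toCanonicalEntity` helper are hypothetical names introduced here, and a real mapping would cover more than one entity.

```typescript
// The one name the new frontend uses for this concept.
type CanonicalEntity = "User";

// Every backend spelling we have seen, mapped to the canonical name.
// Partial<> makes lookups return `CanonicalEntity | undefined`,
// so unknown names have to be handled explicitly.
const BACKEND_ALIASES: Partial<Record<string, CanonicalEntity>> = {
  User: "User",
  UserModel: "User",
  Person: "User",
  Staff: "User",
};

// Translates a backend entity name into the frontend's vocabulary,
// failing loudly when a name has not been mapped yet.
function toCanonicalEntity(backendName: string): CanonicalEntity {
  const canonical = BACKEND_ALIASES[backendName];
  if (canonical === undefined) {
    throw new Error("Unmapped backend entity: " + backendName);
  }
  return canonical;
}
```

Throwing on unmapped names is a deliberate choice in this sketch: a new alias showing up in an API response is surfaced immediately instead of leaking a fifth spelling into the new codebase.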

6. Functionality that needed to look exactly like the old one

Since we were rebuilding the UI, a lot of the functionality needed to look exactly like it did in the old application. This is a problem because we have a new design system, and we want to use it.

But why copy the old one? Beyond legacy muscle memory, there are critical features involving complex visualizations that are hard to improve on. These can include complex tables with specific functionality, or domain-specific visualizations.

Our approach was to make the functionality look as similar to the old one as possible while still following our design system wherever we could. This has worked out alright for us.

7. Understanding edge cases

For me, this is the hardest of the problems: how do you discover all the edge cases in a 250+ screen application? The answer is: you don’t. You just trust that the requirements you got from the client are right, and that your approach satisfies them.

But let’s be realistic: in an application that has been around for 10+ years, even with the best documentation team (which most clients don’t have), the edge cases aren’t documented. Sometimes they are fixed in a quick patch and never mentioned again. So, how can we make sure we deliver a product that works for all the edge cases?

We can’t. But what we can do is make sure we have a good testing system. So that’s our approach: implement a testing system that catches as many edge cases as possible.
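One technique that fits this goal is characterization testing: record what the old application actually does for a set of inputs, then check that the rebuilt code produces the same output. The sketch below is only an illustration of that idea, not a description of our actual test suite. The recorded cases, the parentheses-for-negatives rule, and every name in it are invented for the example.

```typescript
// One observation of how the old application behaved.
interface RecordedCase<I, O> {
  name: string;
  input: I;
  legacyOutput: O;
}

// Returns the names of the cases where the rebuilt code disagrees with the recording.
function findMismatches<I, O>(
  cases: RecordedCase<I, O>[],
  rebuilt: (input: I) => O
): string[] {
  return cases
    .filter((c) => JSON.stringify(rebuilt(c.input)) !== JSON.stringify(c.legacyOutput))
    .map((c) => c.name);
}

// Hypothetical recordings, including an "undocumented" edge case.
const recorded: RecordedCase<number, string>[] = [
  { name: "regular amount", input: 1250, legacyOutput: "12.50" },
  { name: "negative amount shown in parentheses", input: -500, legacyOutput: "(5.00)" },
];

// Rebuilt formatter that replicates the recorded behavior.
function formatAmount(cents: number): string {
  const value = (Math.abs(cents) / 100).toFixed(2);
  return cents < 0 ? "(" + value + ")" : value;
}

// An empty list means the rebuilt formatter matches every recording.
const mismatches = findMismatches(recorded, formatAmount);
```

A formatter that simply returned `(cents / 100).toFixed(2)` would be flagged on the second case, which is exactly the kind of silent, patched-in behavior that never makes it into the requirements.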

The conclusion is that a proper rebuild requires an extensive reverse-engineering process. It is extremely costly for our team to properly understand the application’s current behavior and recreate it in the new application. Here, we have a two-sided view of the situation:

a) Since we have existing code, the rebuild is easier because we don’t need to spend time deciding which data flow to follow.

b) Since we have existing code, it is harder to rebuild the application the right way because we need to follow the current data flow.

I ultimately believe that it is a mix of both, depending on the situation. A lack of freedom can be a good thing because it limits our options, and it can also be bad for the same reason. Having the team reverse-engineer the code into a different framework is also problematic, because it takes a lot of time.

But there is a bright side!

This was not our first time modernizing an application, and it won’t be the last. Once we recognize these challenges, bringing new life to old user experiences can be done on time (and often on budget). Next, I’ll talk about how to successfully avoid these traps and set your team up for success when rebuilding a large application. Stay tuned!