This project requires building an MVP in 3 weeks to meet an urgent demand, with the best architecture this timeline allows. Later we will convert it into a FastAPI + React application and expand its scope.
The application should:
1 - Orchestrate the execution of 100+ different robots that will scrape websites for our clients in order to consolidate a wide range of documents and invoices. A team is being assembled to build the robots (you can see the details in our other job post). We might use a service like ScrapingHub if it makes sense.
2 - Route each piece of data received from the robots to one of 5 domain controllers, according to the type of supplier the data is coming from.
3 - Have the business rules handled by 5 domain controllers with strictly defined boundaries and responsibilities (making it easier to convert them to microservices in the future).
4 - Follow best practices for security (strict CORS policy, defined access policies for all boundary functions, etc.).
5 - Have its business rules defined using BDD techniques (Gherkin). We will not use BDD for testing the MVP, only for clarity of communication, but we plan to use it with Behave in future versions. You will NOT be required to write anything in Gherkin, but you will participate in the discussions to define it and will be expected to implement the behaviours as agreed.
6 - Expose a GraphQL API.
7 - Be very performant and scale efficiently.
8 - Have good maintainability.
9 - Have the UX and API completely decoupled from the domain controllers.
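To make points 1 to 3 concrete for discussion, here is a minimal sketch of how orchestration, routing, and domain boundaries could fit together. It is an illustration under assumptions, not a decided design: the names RobotResult, DomainController, Router, run_robots, and make_fake_robot are hypothetical, as is the idea that each robot is an async callable returning one result tagged with a supplier type.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Protocol


@dataclass(frozen=True)
class RobotResult:
    """One piece of data returned by a robot (hypothetical shape)."""
    robot_id: str
    supplier_type: str  # decides which domain controller receives the data
    payload: dict


class DomainController(Protocol):
    """The only surface the rest of the application may call on a domain."""
    def handle(self, result: RobotResult) -> None: ...


class RecordingController:
    """Stand-in domain controller that only stores what it receives."""
    def __init__(self) -> None:
        self.received: list[RobotResult] = []

    def handle(self, result: RobotResult) -> None:
        self.received.append(result)


class Router:
    """Maps a supplier type to exactly one domain controller."""
    def __init__(self, controllers: dict[str, DomainController]) -> None:
        self._controllers = controllers

    def route(self, result: RobotResult) -> None:
        controller = self._controllers.get(result.supplier_type)
        if controller is None:
            raise ValueError(f"no domain controller for {result.supplier_type!r}")
        controller.handle(result)


# Assumption: a robot is an async callable that returns a single result.
Robot = Callable[[], Awaitable[RobotResult]]


async def run_robots(robots: list[Robot], router: Router,
                     max_concurrency: int = 10) -> list[Exception]:
    """Run all robots with bounded concurrency and route their results.

    Failures are collected and returned instead of aborting the whole run.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    failures: list[Exception] = []

    async def run_one(robot: Robot) -> None:
        async with semaphore:
            try:
                router.route(await robot())
            except Exception as exc:  # one failing robot must not stop the rest
                failures.append(exc)

    await asyncio.gather(*(run_one(robot) for robot in robots))
    return failures


def make_fake_robot(robot_id: str, supplier_type: str) -> Robot:
    """Build a fake robot for demonstration; real robots would scrape a site."""
    async def robot() -> RobotResult:
        await asyncio.sleep(0)  # placeholder for real I/O
        return RobotResult(robot_id, supplier_type, {"invoice": robot_id})
    return robot
```

Because the router depends only on the DomainController protocol, a controller could later be replaced by a client for a remote microservice without touching the orchestration code. The supplier types passed to the router (for example "utilities" or "telecom") are placeholders; the real five domains are not defined here.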
We are thinking of using Django for the MVP, possibly only for the presentation logic, without the ORM and forms, to make the later migration to FastAPI easier. This is open for debate, and we welcome your thoughts on it.
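As a rough illustration of what "Django only for the presentation logic" could mean, the sketch below keeps the use case free of any web framework import, so that only a thin adapter would change when moving to FastAPI. The names InvoiceSummary, list_invoice_summaries, and render_json are hypothetical, and the invoice fields are invented for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InvoiceSummary:
    """Plain data object returned by the use case (hypothetical fields)."""
    supplier: str
    total_cents: int


def list_invoice_summaries(raw_invoices: list[dict]) -> list[InvoiceSummary]:
    """Framework-free use case: no Django or FastAPI imports in this module."""
    return [
        InvoiceSummary(supplier=item["supplier"], total_cents=item["total_cents"])
        for item in raw_invoices
    ]


def render_json(summaries: list[InvoiceSummary]) -> str:
    """Stand-in for the web adapter: it serializes and does nothing else.

    In the MVP a Django view would play this role; after the migration a
    FastAPI route would. Neither should contain business rules.
    """
    return json.dumps([asdict(summary) for summary in summaries])
```

With this split, the migration would mostly consist of rewriting adapters like render_json, while functions like list_invoice_summaries stay untouched.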
Points 1, 4 and 5 are mandatory in the MVP. Point 6 might be postponed. The other points might not be fully met by the MVP if they put our ability to deliver in 3 weeks of development at risk.
We may have up to 8 developers working in parallel if necessary: one for the robots controller, one for the GraphQL API, one for the presentation logic and UX, and one for each of the five business domains. Following point 3 would probably make this easier to manage.
If you have ever worked on a project with a software architecture similar to this, please let us know. If you disagree with any of the points above for technical reasons, we would love to hear about it. Nothing is set in stone for now, but once decisions are made we will move fast.