Advanced Software Engineering - Spring 2020

Project instructions

Version 1.3 (May 19th, 2020): Included the instructions for the submission and presentation.

Version 1.2 (March 29th, 2020): Switched the midterm project presentation to be submitted as a document.

Version 1.1 (March 2nd, 2020): Added the specifications for project proposal presentations and deliverables.

Version 1.0 (February 15th, 2020): First version.

The goal of the project is to produce an application that processes and visualizes data. Teams of three students will apply most of the software engineering processes presented in the lectures.

For this project, there are only a few strict requirements regarding the software engineering processes you and your team have to apply during development. The rest is up to your creativity. You are free to use any programming language and technology you prefer, as long as you follow the requirements. You can take the chance to experiment with technologies you have never used before, or stick with the knowledge your team already has. Risk analysis is part of your job here.

🚀 Application requirements

Here we describe the few application requirements you have to follow. We do not want to force you to use any particular programming language or technology. What matters most is that you always motivate your choices. You will have many occasions to do so: make use of the documentation and the presentations.

Your application is based on two main components: a backend and a frontend. The backend is in charge of exposing the dataset you chose as an API. The frontend is what the user sees.

📒 Expose an API

The backend exposes a dataset in the form of an API. One option is to use a REST API. However, you can also consider other modern API styles, for instance GraphQL. It is up to you to decide how to store and process the data. You can use the backend for the majority of the computation, but you can also decide to move part of it to separate components or even to the frontend on the client.
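To make the idea of exposing a dataset as an API more concrete, here is a minimal sketch that uses only the Python standard library. It is one possible shape, not a required design: the `/apps` route, the `category` and `min_rating` query parameters, the sample records, and all function names are assumptions made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory sample; a real backend would load the chosen
# dataset from a file or a database.
APPS = [
    {"name": "Photo Editor", "category": "PHOTOGRAPHY", "rating": 4.1},
    {"name": "Sketch Pad", "category": "ART_AND_DESIGN", "rating": 4.5},
    {"name": "Coloring Book", "category": "ART_AND_DESIGN", "rating": 3.9},
]


def query_apps(apps, category=None, min_rating=0.0):
    """Return the apps matching an optional category and a minimum rating."""
    return [
        app for app in apps
        if (category is None or app["category"] == category)
        and app["rating"] >= min_rating
    ]


def handle_get(path):
    """Map a request path to (status code, JSON body).

    Kept separate from the HTTP handler so it can be unit tested
    without opening a socket.
    """
    url = urlparse(path)
    if url.path != "/apps":
        return 404, json.dumps({"error": "not found"})
    params = parse_qs(url.query)
    category = params.get("category", [None])[0]
    try:
        min_rating = float(params.get("min_rating", ["0"])[0])
    except ValueError:
        return 400, json.dumps({"error": "min_rating must be a number"})
    return 200, json.dumps(query_apps(APPS, category, min_rating))


class ApiHandler(BaseHTTPRequestHandler):
    """Thin HTTP layer that delegates all logic to handle_get."""

    def do_GET(self):
        status, body = handle_get(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=8000):
    """Build the server; call .serve_forever() on the result to start it."""
    return HTTPServer(("127.0.0.1", port), ApiHandler)
```

Calling `make_server().serve_forever()` would start the service locally, after which a frontend could request, for example, `/apps?category=ART_AND_DESIGN&min_rating=4`. Keeping the routing logic in a plain function makes it straightforward to test without a running server.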

🔗 Combine with other data sources

The data we suggest might not be enough to devise some functionalities. For this reason, you have to include another source of information. It can be either another dataset, which you should expose as an API as well, or external sources, such as Twitter, or maps.
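As an illustration of what combining two sources can look like, the sketch below enriches records from a primary dataset with information from a second source, joined on a shared key. The field names (`neighbourhood`, `nearby_venues`), the sample values, and the function name are hypothetical and chosen only for the example; the numbers are made up.

```python
def enrich_listings(listings, neighbourhood_info):
    """Attach external information to each listing (a left join on neighbourhood).

    Listings whose neighbourhood is missing from the second source are kept
    unchanged, so no primary record is lost.
    """
    enriched = []
    for listing in listings:
        extra = neighbourhood_info.get(listing["neighbourhood"], {})
        # Keys from the external source override same-named listing keys.
        enriched.append({**listing, **extra})
    return enriched


# Made-up sample records standing in for the primary dataset.
listings = [
    {"id": 1, "neighbourhood": "Harlem", "price": 80},
    {"id": 2, "neighbourhood": "Midtown", "price": 200},
    {"id": 3, "neighbourhood": "Elsewhere", "price": 50},
]

# Made-up values standing in for a second dataset or an external service.
neighbourhood_info = {
    "Harlem": {"nearby_venues": 12},
    "Midtown": {"nearby_venues": 30},
}

combined = enrich_listings(listings, neighbourhood_info)
```

In a real application the second dictionary would typically be built from another dataset or from the responses of an external service, and the join key would depend on what the two sources have in common.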

πŸ–ΌοΈ Visualize the data

This part calls for your creativity. You have to come up with the functionalities that are going to characterize your application, rather than building a simple dataset visualizer. The data you expose or process has to be visualized in a frontend. It can be a desktop application, a web application, or a mobile app. You have to connect the frontend to the backend and show the data to the user.

💾 Datasets

The possible dataset topics are listed below. The links refer to a version of each dataset on the Kaggle website. However, you are not forced to use exactly these copies. You can also use other versions of a dataset, but you have to keep the same topic. Moreover, you can mine your own data if you believe that your dataset does not contain enough information for the purpose of your application, but please note that mining is not a requirement for the evaluation of the project.

Google Play Store

https://www.kaggle.com/lava18/google-play-store-apps

The dataset contains information about around 10,000 apps published on the Google Play Store. It includes the rating and price of the apps, as well as the top 100 reviews for each app.

Apple App Store

https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps

The dataset contains around 7,000 apps from the Apple App Store. The data includes information on the categories, ratings, and pricing.

New York City Airbnb

https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data

The dataset describes the listing activity and metrics of advertisements on Airbnb, in the city of New York in 2019.

Soccer players

https://www.kaggle.com/karangadiya/fifa19

The dataset includes the attributes of every player registered in the latest edition of the FIFA 2019 game.

Movies

https://www.kaggle.com/rounakbanik/the-movies-dataset

The dataset contains the metadata of about 45,000 movies released before July 2017. The data includes information such as the cast, crew, plot, and keywords. It also contains 26 million ratings from 270,000 users.

Game of Thrones

https://www.kaggle.com/mylesoneill/game-of-thrones

The dataset combines three sources of data from George R. R. Martin’s book series. It includes the data for all the battles and the deaths of characters.

Pokémon

https://www.kaggle.com/abcsds/pokemon

The dataset includes 721 Pokémon species from the first 6 generations of the games. The data includes their number, types, and basic combat statistics.

📋 Project requirements

The project development process has a few more requirements to follow.

📅 Project planning and version control

For the project, we want you to apply an adapted version of the Scrum methodology. We want to track your activities and interactions in each phase of project development. For this reason, you have to use a git repository combined with a project management platform. There are several options available, but we suggest using GitHub, since it already has everything you might need.

You have to use Kanban or Scrum boards to organize your sprints and tasks. You can use the GitHub Projects functionality to do so, since it can easily be linked to the users involved in the repository, the issues that you open and resolve, and especially the source code itself. You also have the option to use Trello. Please note that we take some inspiration from both the Scrum and Kanban methodologies in this regard. However, since we do not have the structure to reproduce an actual Scrum process, we suggest that you be flexible, especially regarding time. For this reason, Kanban boards might be a better choice than Scrum boards, since they support a continuous flow of project improvement instead of fixed-duration sprints and rules.

As for the source code, you have to store it in the git repository, together with the documentation. All team members can work independently and merge their code, resolving conflicts whenever they occur. You are required to use at least a master branch and as many development branches as you need. One suggestion is to use the git workflow branching model, but you are not forced to use it. However, you are required to use pull requests every time you want to merge code into the master branch, so that you can discuss and review potential changes with your collaborators.

📖 Documentation

You are required to document your project and source code. You should apply an Agile philosophy to the documentation. Try not to postpone writing it. First of all, we need to understand what you did and how your application works. Second, you are working with other people, and proper documentation is very useful for communicating the behavior of your code.

First, you have to provide a README.md file in the root of the project, as well as in the root of every microservice you define. As for the source code, you can use a documentation generator for the language you choose, for instance Javadoc. Never forget to document the test code.
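As a small illustration of source-level documentation that a generator can pick up, the sketch below documents a function with a Python docstring. The function itself and the Args/Returns/Raises layout are assumptions chosen for the example; other languages and tools, such as Javadoc for Java, follow the same idea with their own syntax.

```python
import inspect


def average_rating(ratings):
    """Return the arithmetic mean of a list of numeric ratings.

    Args:
        ratings: A non-empty list of numbers.

    Returns:
        The mean as a float.

    Raises:
        ValueError: If ``ratings`` is empty.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(ratings) / len(ratings)


# Documentation tools read the docstring from the function object itself;
# inspect.getdoc returns it with the indentation cleaned up.
summary_line = inspect.getdoc(average_rating).splitlines()[0]
```

Running `python -m pydoc` on a module containing such functions prints the collected docstrings, which is a quick way to check what a reader of the generated documentation would see.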

You then have to maintain a wiki for the project. Every GitHub project has a parallel repository where you can store Markdown files to compose the wiki. We expect you to describe your project, its architecture, and how to use it. In particular, we want to understand the motivation for all of your choices.

You can also use the wiki to document the usage of the backend API. Alternatively, you can use dedicated API documentation tools, such as Swagger, if you decide to use REST.

📦 Containerization

Your application has to be engineered as a cloud application, meaning that it can be easily deployed in the cloud and scaled when the load increases. Every microservice of your application has to be put into a separate Docker container. You have to provide a Dockerfile for each of the microservices you develop. You also have to provide a docker-compose.yml file to launch the entire application.

👾 Tests

Always write tests for all the source code you produce. Unit tests are welcome. You have to run the tests as part of the continuous integration flow.
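For reference, a unit test can be as small as the sketch below, written with Python's built-in `unittest` module. The helper `price_to_float` is a hypothetical function invented for the example; your own tests would target the functions of your application, in whatever language and test framework you chose.

```python
import unittest


def price_to_float(text):
    """Convert a price string such as "$4.99" or "Free" to a float.

    Hypothetical helper used only to have something to test.
    """
    cleaned = text.strip().lstrip("$")
    if cleaned.lower() in ("", "free"):
        return 0.0
    return float(cleaned)


class PriceToFloatTest(unittest.TestCase):
    def test_paid_app(self):
        self.assertEqual(price_to_float("$4.99"), 4.99)

    def test_free_app(self):
        self.assertEqual(price_to_float("Free"), 0.0)

    def test_invalid_price_raises(self):
        # float() raises ValueError for text that is not a number.
        with self.assertRaises(ValueError):
            price_to_float("n/a")


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceToFloatTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a project layout, such tests would normally live in their own files and be run with `python -m unittest`, which is also the kind of command a continuous integration job can execute on every change.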

➰ Continuous integration

Your application has to follow a continuous integration flow. In particular, every time your code is modified, especially on the master branch, you have to automate the execution of your test suites. For instance, you can use Travis CI.

You are also requested to compute the code coverage of your tests, for instance by combining Travis CI with Coveralls.

You also have to include a quality check phase. To do so, you can use SonarQube.

Finally, you have to build your container images and push them to a public Docker registry. Most continuous integration services allow you to do this.

If you are hosting your application, you can add an automated process for deployment. However, it is not required for the evaluation.

⏱️ Sprints

Sprint 1 - Registration

Events:

Working time: from 17.02.2020 to 23.02.2020 (1 week)

After the kick-off lecture, where we present the organization of the project together with the description of the dataset options, you have to register your group. The deadline to complete this is the day before the second lecture.

You can register in teams of up to three students. Each team is required to register by sending an e-mail from the team coordinator of your choice. If you do not have a team, you can register individually and we will then assign you to a team.

The final groups will be announced during the lecture.

Sprint 2 - Project proposal

Events:

Working time: from 24.02.2020 to 08.03.2020 (2 weeks)

Once the groups have been formed, you will have almost two weeks to come up with an idea for the application you want to propose.

First, you have to produce a short document in which you describe the purpose of your application and the planned architecture. The document has to make clear which technologies and programming languages you intend to use, and it has to include a plan for the project.

On the day of the presentation, you will present your proposal in front of the class. You can use your time to show the things you mentioned in the document. There is no evaluation at this point; the purpose is to collect feedback about the feasibility of the proposals.

Your group will have a 15-minute time slot, including the time for questions. This is the time reserved for each group. However, if need be, we can continue the discussion after the group presentations. You can use slides if you wish, but it is not mandatory. If you have alternative ideas, you are free to propose them as well. Just use your time wisely.

After the collection of the feedback, you can actually start working on the project. However, you have to submit a revised version of the proposal document within a week.

Please note that the chosen technologies and programming languages can change as your work progresses. However, you should have at least a preliminary idea about them in order to get better feedback.

Sprint 3 - Start to work

Events:

Working time: from 09.03.2020 to 03.04.2020 (4 weeks), extended from the original 09.03.2020 to 29.03.2020 (3 weeks)

This midterm presentation is not part of the evaluation; it serves to monitor progress and to give you feedback on the project so far.

You will have to submit a document or slides describing the current implementation and design status and the way you are organizing the software project in terms of process. Please remember that the purpose is specifically to get feedback, so use this occasion wisely.

Sprint 4 - Project finalization

Events:

Working time: from 30.03.2020 to 11.05.2020 (6 weeks)

You have to present the work you completed. The presentation will be part of the evaluation.

📄 Deliverables

📄 Proposal document

You have to produce a document of 1-2 pages in which you state your proposed idea. The essential parts are the selection of the dataset and the business logic of your application, given as a list of functionalities. The rest is not mandatory at the moment. Feel free to add some motivation or user stories if you wish.

📄 Project submission

You need to wrap up everything you produced. We need a Markdown README.md document as a reference and set of instructions for your entire submission. You are free to attach any additional files you believe would be useful, but please reference them in README.md.

Hopefully, you have organized everything into a wiki already. If you did so, please point to it in the instructions file.

🌟 The keyword here is motivation: if you made a choice, please specify a motivation for it.

Source code

For the source code, you have to freeze the repositories you produced. First, create a tag for your repositories marking a release for the project. Then, if you are on GitHub, you can easily export an archive file with everything packed. Otherwise, you can use the following command:

git archive ${COMMIT} --format zip --output ${FILE_PATH}.zip

Please also report the links to the repositories. If they are private, grant us access.

Project organization

We need to understand how you, as a group, scheduled the tasks for your project. Since you were asked to use an adapted Scrum methodology, we need to understand what you did exactly. You can attach screenshots of the Scrum/Kanban boards you used. It is important that you let us know the process you applied. Please, report a timeline that gives an overview of the development. You can use a Gantt chart, for instance.

Also, we need to understand how you organized the versioning of your code. One suggestion is to extract the git branching graph and describe the branching model you used, if any.

Finally, report on how you used pull requests, as you were asked to. We need to understand how you took advantage of them.

If you reported the above-mentioned information as part of the wiki, just mention that.

Documentation

We need documentation that gives an overview of the whole system, since you should have organized everything into microservices. Anything useful to better understand your code will be appreciated. Also, mention the specific technologies you used and the motivation for them.

As for single microservices, we expect to find a README.md file in the root of every repository you used.

You should also have documented the source code, so report how you did it, for example which documentation generator you used, if any.

Finally, report the documentation for the API.

Please, include the link to the wiki you produced.

How to run it

Consider this part of the submission as the description of the way to run everything in production.

Since you were asked to put every service in a separate container, the containers have to be buildable and runnable. Thus, we expect to find the Dockerfile and docker-compose.yml files needed to run the whole application.

Of course, exceptions may occur for particular applications. For instance, you may have built a mobile application; in that case, report everything you find useful in this regard.

Testing and continuous integration

Testing was part of your job. Report everything useful to describe the way you planned and performed the testing of your code.

Also, we need information about the continuous integration flow you applied. Testing was probably part of it, so you can decide to describe it here. Also, mention any quality check steps you applied and whether you took their results into consideration during development.

🎤 Presentation

You have to prepare a presentation of 15 minutes. Everyone in the group must have a part in the presentation. Please avoid continuously switching between speakers; instead, split the presentation into distinct parts: for instance, if you are three people, have three parts. At the end of the presentation, there will be a question-and-answer session of at most 5 minutes.

There is no single best way to present your project. Think of it as a mix of marketing, in which you describe your product and its amazing features, and a report on the project. Here are some of the things you might want to put into your presentation:

Any kind of visual representation, for instance videos or demos, is welcome 🙂.