Monday, June 8, 2009

Stuck in Issue Sensordata Design

Issue sensordata differs from other sensordata in that it does not conceptually belong to a particular user. Instead, it belongs to a software project, but not a project in the Hackystat sense. In Hackystat, a project is just a definition that groups users and data to represent an actual project; an associated actual project does not necessarily exist. The problem is that all Hackystat sensordata belongs to a user. We have to choose a user to process the data, and that user has to stay in every project the issue may belong to for the whole lifetime of those projects. This is like a project administrator, but such an administrator would need to be managed by users, not by the system. Hackystat does not define an administrator among its users, and we don't want to make this exception for just one kind of sensordata (the others do not need it).

The first design is to store the changes of an issue, starting from the creation of the issue and followed by its updates. Each issue sensordata is then assigned to the user who made that creation or update. The main data source will be the RSS feed of the issue updates. The advantage is that it keeps every update, with the help of RSS, and the owner of the data is reasonable. The drawback is that the RSS feed provides limited information. In the creation entry, it only includes the comment. In the update entries, it only includes the state/labels that were changed. The current unchanged state is unknown from RSS and has to be fetched from the issue tracking system (via HTTP in most cases). Another problem is that, when analyzing the data, all data from the project start time has to be gathered to get the view at a given time. That might be a lot of computation if there are lots of updates.
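To make the analysis cost concrete, here is a minimal sketch of how the state of an issue at a given time would be rebuilt under this design. The names (IssueEvent, state_at, the "status" and "priority" fields) are made up for illustration and are not Hackystat classes, and the sketch assumes the creation event already carries the full initial state fetched from the issue tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class IssueEvent:
    """One sensordata instance: a creation or update of an issue (hypothetical)."""
    issue_id: int
    timestamp: datetime
    owner: str                 # the user who made this change
    changes: Dict[str, str]    # only the fields that changed


def state_at(events: List[IssueEvent], issue_id: int, when: datetime) -> Dict[str, str]:
    """Replay every event of the issue up to `when` to rebuild its state."""
    state: Dict[str, str] = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.issue_id != issue_id or event.timestamp > when:
            continue
        state.update(event.changes)
    return state


# Sample data: the first event is assumed to hold the complete initial state.
EVENTS = [
    IssueEvent(7, datetime(2009, 6, 1), "alice", {"status": "New", "priority": "Medium"}),
    IssueEvent(7, datetime(2009, 6, 4), "bob", {"status": "Accepted"}),
    IssueEvent(7, datetime(2009, 6, 8), "alice", {"status": "Fixed"}),
]

snapshot = state_at(EVENTS, 7, datetime(2009, 6, 5))
```

Every query has to scan the events from the beginning, and the events are spread over several owners, which is where the computation cost comes from.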

The second design is to associate a single sensordata instance with a single issue. The updates will be stored in the property list of that sensordata, from the same data source: RSS. When the issue data is created, the current state/labels will be extracted from the issue tracking system, which makes it easier to keep track of future changes. It is also easier to analyze: we only need to go through that single data instance to figure out the state at a given time. However, the problem with this design is the owner of the sensordata. We have to know the owner to get the data, and in project-level analysis that owner has to be in the project the issue belongs to. There is no reasonable way to answer this question without some hack or without modifying or adding to the current system definitions. It is possible to let the user define who the data belongs to, but that is unsafe: it requires not only that the user knows exactly what he is doing, but also that all sensors collecting data for the same project are configured exactly the same way. Otherwise, there may be multiple copies of the data instance, which breaks the single-instance assumption.
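A rough sketch of this second design, again with invented names (IssueRecord, record_update, save) rather than real Hackystat types. It assumes the store is keyed by owner plus issue id, which is my simplification of how sensordata is looked up by its owner.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class IssueRecord:
    """A single sensordata instance that holds one issue and its history (hypothetical)."""
    issue_id: int
    owner: str                      # the field with no obvious right answer
    created: datetime
    initial_state: Dict[str, str]   # read from the issue tracker at creation
    updates: List[Tuple[datetime, Dict[str, str]]] = field(default_factory=list)

    def record_update(self, when: datetime, changes: Dict[str, str]) -> None:
        """Append one RSS update and keep the history in time order."""
        self.updates.append((when, dict(changes)))
        self.updates.sort(key=lambda update: update[0])

    def state_at(self, when: datetime) -> Optional[Dict[str, str]]:
        """Return the state at `when`, or None if the issue did not exist yet."""
        if when < self.created:
            return None
        state = dict(self.initial_state)
        for timestamp, changes in self.updates:
            if timestamp > when:
                break
            state.update(changes)
        return state


def save(store: Dict[Tuple[str, int], IssueRecord], record: IssueRecord) -> None:
    """Store a record under (owner, issue_id), the assumed lookup key."""
    store[(record.owner, record.issue_id)] = record


record = IssueRecord(7, "alice", datetime(2009, 6, 1), {"status": "New"})
record.record_update(datetime(2009, 6, 4), {"status": "Accepted"})

# Two sensors configured with different owners end up with two copies.
store: Dict[Tuple[str, int], IssueRecord] = {}
save(store, record)
save(store, IssueRecord(7, "bob", datetime(2009, 6, 1), {"status": "New"}))
```

Looking up the state only touches one record, but the last lines show the weak point: as soon as two sensors disagree on the owner, the same issue is stored twice.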

1 comment:

austen.ito said...

Can we have project-level sensordata? For example:

3.2.4 GET {host}/projects/{projectname}/sensordata

in addition to:

3.2.4 GET {host}/projects/{owner}/{projectname}/sensordata

I think it would be useful to retrieve all data for a project through one request. You could limit the project's data to only "issue" data with query strings.

You would need to change the API and the route handling, but that might be an option.

This is a tough one. Not sure what the best way to handle it would be.