About Google Summer of Code Project - Collaborative Scaladoc

Xie Xiaodong
Hello all,

I'm Xie Xiaodong, a master's student at KTH, Sweden, and I'm quite interested in the Scala language. I'm currently working on my thesis project at Ericsson, Lulea. I used Lift in part of my thesis work and found both the web framework and the Scala language quite promising. I'd like to learn Scala in more depth, and I think the best way to do that is to take part in a real Scala project. Since I'm already familiar with ordinary use of Lift, and this project seems likely to contribute a lot to the Scala language itself, I chose it.
If you'd like, I could send you some parts of my thesis work, excluding anything that contains Ericsson material.
By the way, I took part in GSoC 2009 and passed Google's final evaluation, so I am quite confident that I could finish this project too.
Here are some of my initial thoughts:
I'd like to keep the style of ScalaDoc 2 as it is now and add an "edit" button to the comment section. When the user clicks this button, a new text-area DOM element is shown under the comment to be altered, pre-filled with the original comment. Once the user finishes editing, the contents are submitted and stored in persistent storage. Since the submitted comments do not fit a conventional relational database very well, I lean towards a document-oriented store such as CouchDB or MongoDB. I have a question here: how would the generated XML documents (assuming we use the XML-Scaladoc patching tool mentioned in the introduction of this project) be merged if conflicts exist? If I understand correctly, there will be a dedicated person applying the generated XML files to the Scala code base. Is that so?
By the way, I received the following reply from Gilles:
> Here are some of my initial thoughts:
>
> I'd like to keep the style of ScalaDoc 2 as it is now and add an "edit" button to the comment section. When the user clicks this button, a new text-area DOM element is shown under the comment to be altered, pre-filled with the original comment. Once the user finishes editing, the contents are submitted and stored in persistent storage.

That sounds like a reasonable solution.

> Since the submitted comments do not fit a conventional relational database very well, I lean towards a document-oriented store such as CouchDB or MongoDB.

Why doesn't their structure match relational databases well? Relational DBMSs have qualities that other data storage systems do not necessarily have (think ACID transactions, scalable performance, multiple clients). I am not saying that the project should use a relational DBMS to store data, and I don't know enough about document-oriented persistent storage to have a meaningful opinion on it, but your argument against relational databases seems a bit weak to me.

> I have a question here: how would the generated XML documents (assuming we use the XML-Scaladoc patching tool mentioned in the introduction of this project) be merged if conflicts exist? If I understand correctly, there will be a dedicated person applying the generated XML files to the Scala code base. Is that so?

I imagine that the conflicts you mention are between changes made on the wiki and those made to the source code by developers? I don't know for sure what should happen in this case. In fact, I think that finding the right balance between control and "crowd sourcing" is an interesting facet of this project. Having an expert (a recognised Scala developer) check the changes before committing them to source, and resolve conflicts in the process, is an obvious and straightforward solution. But it may well be that a completely crowd-sourced approach, including for handling merge conflicts, ends up working better. I would like the student doing this project to explore such questions.

In any case, conflicts due to concurrent wiki edits should probably be handled by the wiki itself (either by preventing concurrent edits, or by handling conflicts when they arise in some way).



Here are my thoughts on Gilles's reply:
I lean towards document-oriented databases because I think these wiki comments have no obvious relational schema and have to be stored as a whole in some key->value storage (the key is the element being commented on, the value is the comment itself). This is what document-oriented databases are good at.
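As a rough illustration of the key->value layout described above, here is a minimal Python sketch. It uses a plain in-memory dictionary as a stand-in for a document store such as CouchDB or MongoDB; the function names, the element-naming scheme, and the document fields are all hypothetical and not part of Scaladoc or of the project description.

```python
import json

# Hypothetical in-memory stand-in for a document store.
# Key: name of the documented element; value: the whole comment as one document.
comment_store = {}

def put_comment(element, text, author):
    """Store the complete comment for one element as a single document."""
    comment_store[element] = {"element": element, "text": text, "author": author}

def get_comment(element):
    """Return the stored document, or None if the element has no wiki comment."""
    return comment_store.get(element)

# Usage: the element name below is only an illustrative key format.
put_comment("scala.List#map", "Builds a new list by applying a function to all elements.", "xie")
print(json.dumps(get_comment("scala.List#map"), indent=2))
```

The whole comment is read and written as one unit, which is the access pattern the paragraph above argues for; nothing here rules out storing the same document in a relational table keyed by element.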
In this particular project, I'd like to use an optimistic concurrency mechanism: allow concurrent modifications, but when a comment is submitted, check its version number. If that version number equals the current version number in the wiki, the modification is accepted; otherwise it is simply rejected. I still have a question, though: do comment modifications need to be rendered immediately after they are submitted, or should only comments that have been merged into the source code be rendered? In the latter case, we could just store all the modifications to the comments, but that would leave a huge workload for the person merging them.
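A minimal Python sketch of the version check described above, assuming each comment carries an integer version that starts at 0 and increases by one on every accepted edit. The class and method names are hypothetical, and the sketch is single-threaded; a real store would need to make the check and the write atomic.

```python
class StaleVersionError(Exception):
    """Raised when an edit was based on an outdated version of a comment."""

class VersionedComments:
    def __init__(self):
        self._docs = {}  # element name -> (version, text)

    def read(self, element):
        """Return (version, text); elements without a comment are at version 0."""
        return self._docs.get(element, (0, ""))

    def submit(self, element, base_version, new_text):
        """Accept the edit only if it was based on the current version."""
        current_version, _ = self.read(element)
        if base_version != current_version:
            raise StaleVersionError(
                f"{element}: edit based on v{base_version}, current is v{current_version}")
        self._docs[element] = (current_version + 1, new_text)
        return current_version + 1

# Usage: two editors open the same comment at the same version.
wiki = VersionedComments()
version_a, _ = wiki.read("List#map")
version_b, _ = wiki.read("List#map")
wiki.submit("List#map", version_a, "Edit from the first editor.")   # accepted
try:
    wiki.submit("List#map", version_b, "Edit from the second editor.")
    second_accepted = True
except StaleVersionError:
    second_accepted = False                                          # rejected as stale
print(wiki.read("List#map"), second_accepted)
```

Rejecting the stale edit outright matches the proposal; an alternative would be to show the second editor the newer text and let them resubmit.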
Any comments on this thread are welcome. Thank you very much.

--
Sincerely yours and Best Regards,
Xie Xiaodong
Gilles Dubochet
Re: About Google Summer of Code Project - Collaborative Scaladoc

> In this particular project, I'd like to use an optimistic concurrency
> mechanism: allow concurrent modifications, but when a comment is
> submitted, check its version number. If that version number equals the
> current version number in the wiki, the modification is accepted;
> otherwise it is simply rejected. I still have a question, though: do
> comment modifications need to be rendered immediately after they are
> submitted, or should only comments that have been merged into the
> source code be rendered? In the latter case, we could just store all
> the modifications to the comments, but that would leave a huge
> workload for the person merging them.

There would definitely be a problem with having modifications not be
rendered immediately. Applying a possibly large set of completely
uncoordinated changes to an original string seems to me all but
impossible.

My own idea concerning the propagation of modifications is that there
should be two "views" of the documentation: the "social" view that is
the result of current social interactions (and is updated whenever
something changes, like a wiki), and a "static" view that is generated
from sources. From time to time, the social view (or relevant parts
thereof) is merged into sources to become part of the static view.
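
The two views described above could be modelled roughly as follows. This is only a Python sketch under simplifying assumptions: each view is a dictionary from element name to comment text, all names are hypothetical, and a merge simply overwrites the comment in sources, which leaves open the question of conflicts with changes made directly in source code.

```python
# Hypothetical model: each view maps an element name to its comment text.
sources = {"List#map": "Applies f to each element."}  # comments as stored in source code
social = dict(sources)                                 # wiki-like view, starts as a copy

def static_view():
    """Documentation as generated from sources (regenerated on demand here)."""
    return dict(sources)

def edit(element, text):
    """A wiki edit changes only the social view and is visible there immediately."""
    social[element] = text

def pending_changes():
    """Elements whose social comment differs from the one in sources."""
    return {e: t for e, t in social.items() if sources.get(e) != t}

def merge_into_sources(elements=None):
    """Copy the selected (default: all) pending social comments into sources."""
    changes = pending_changes()
    if elements is None:
        selected = changes
    else:
        selected = {e: changes[e] for e in elements if e in changes}
    sources.update(selected)
    return sorted(selected)

# Usage: two wiki edits, then only the relevant part is merged into sources.
edit("List#map", "Builds a new list by applying f to every element.")
edit("List#filter", "Keeps the elements that satisfy p.")
merged = merge_into_sources(["List#map"])
print(merged, static_view(), pending_changes())
```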

However, if Xie or other people on the list have other ideas, it would
be great to discuss them.

Cheers,
Gilles.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland