...

  • DCS as common backend. Make the DCS the common backend for both the Carrot2 webapp and the Carrot2 Workbench (quick start, tuning, experimentation app). The benchmarking view from the current Workbench could be implemented as another app, but its measurements would then include network overhead, which may or may not be what the user needs.
  • User apps. Let the DCS serve additional user-created directories, so that users can develop their own apps.
  • Remove document sources? A bit revolutionary, but maybe we should remove document sources from the core Carrot2 API and add them to the DCS instead? For maximum decoupling, each document source could expose an extra endpoint in the DCS, which the client-side app would query to get the data and then send back for clustering. (In reality, most practical DCS use cases upload custom data for clustering anyway.) This doubles the amount of data pushed back and forth (and uplinks are usually slower), but maybe it's worth the simplicity we'd gain (no need for caching in controllers, smaller JARs, smaller docs)? For certain document sources, the client-side app could fetch the data on its own (from the user's IP), which is an added benefit.
    • Question: is it possible to implement a Lucene document source in this model (where the user provides a path to the Lucene index in the client-side app)? Ideally, DCS-based sources should be stateless, but a Lucene source would require some initialization (opening the index).
  • The explicit URL field in documents is misleading. Maybe we should go with an arbitrary list of fields and weights, which would also indicate which fields should be used for clustering?
  • Retire / remove the .NET API
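
To make the "arbitrary list of fields and weights" idea more concrete, here is a rough sketch of what such a document model might look like. It is written in Python for brevity only. Everything in it is hypothetical: `FieldSpec`, `clustering_input`, the `weight` and `use_for_clustering` settings, and the example field names are illustrations, not part of any existing Carrot2 or DCS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical per-field setting: a weight and a 'use for clustering' flag."""
    name: str
    weight: float = 1.0
    use_for_clustering: bool = True


def clustering_input(documents, field_specs):
    """Keep only the fields marked for clustering, paired with their weights."""
    active = [spec for spec in field_specs if spec.use_for_clustering]
    prepared = []
    for doc in documents:
        prepared.append([
            {"field": spec.name, "weight": spec.weight, "text": doc[spec.name]}
            for spec in active
            if doc.get(spec.name)  # skip fields that are missing or empty
        ])
    return prepared


# Example field list: "url" is an ordinary field that is carried along
# with the document but excluded from clustering.
specs = [
    FieldSpec("title", weight=2.0),
    FieldSpec("snippet"),
    FieldSpec("url", use_for_clustering=False),
]
docs = [
    {"title": "Carrot2", "snippet": "Search results clustering engine",
     "url": "http://example.com/a"},
    {"title": "Untitled", "url": "http://example.com/b"},
]
prepared = clustering_input(docs, specs)
```

In this sketch the URL stops being special: it is just another field that happens to be switched off for clustering, and a document without a snippet simply contributes fewer fields.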