This project has retired. For details please refer to its Attic page.

You can modify the default DataSource to read custom properties or a different entity type.

This page explains how to add user-defined properties to the items returned by your engine. As an example, we add the properties "title", "date" and "imdbUrl" to the entity type "item".

You can find the complete modified source code here.

Note: you also need to import events that set these properties on your items. They are read as required properties, so training fails for any item that lacks them.
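As a rough illustration of what such an event could carry, the sketch below builds the JSON body of a "$set" event for an item with the three new properties. It assumes the usual "$set" event shape (event, entityType, entityId, properties). The object name SetItemEventSketch, the helper functions, and the category value "c1" are hypothetical and not part of the template.

```scala
object SetItemEventSketch {
  // Minimal JSON string escaping for backslashes and quotes (illustration only).
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Builds the JSON body of a "$set" event that carries the new item properties.
  def setItemEvent(
      entityId: String,
      title: String,
      date: String,
      imdbUrl: String,
      categories: Seq[String]): String = {
    val props = Seq(
      "\"title\":" + quote(title),
      "\"date\":" + quote(date),
      "\"imdbUrl\":" + quote(imdbUrl),
      "\"categories\":" + categories.map(quote).mkString("[", ",", "]")
    ).mkString("{", ",", "}")

    Seq(
      "\"event\":" + quote("$set"),
      "\"entityType\":" + quote("item"),
      "\"entityId\":" + quote(entityId),
      "\"properties\":" + props
    ).mkString("{", ",", "}")
  }
}
```

The resulting string is what you would send as the request body when importing the item; how you send it (SDK, batch import, or HTTP client) is up to your setup.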

Modification

DataSource.scala

  • add the new fields to the Item class
  • change how the Item object is created so that it reads these fields from the entity properties
// MODIFIED
case class Item(
  title: String,
  date: String,
  imdbUrl: String,
  categories: Option[List[String]])

...

  override
  def readTraining(sc: SparkContext): TrainingData = {
    ...
    // create a RDD of (entityID, Item)
    val itemsRDD: RDD[(String, Item)] = PEventStore.aggregateProperties(
      appName = dsp.appName,
      entityType = "item"
    )(sc).map { case (entityId, properties) =>
      val item = try {
        // Assume categories is optional property of item.
        // MODIFIED
        Item(
          title = properties.get[String]("title"),
          date = properties.get[String]("date"),
          imdbUrl = properties.get[String]("imdbUrl"),
          categories = properties.getOpt[List[String]]("categories"))
      } catch {
        case e: Exception => {
          logger.error(s"Failed to get properties ${properties} of" +
            s" item ${entityId}. Exception: ${e}.")
          throw e
        }
      }
      (entityId, item)
    }.cache()

    ...
  }
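The difference between the required lookups (get) and the optional one (getOpt) can be tried out without Spark or the event store. The following is only a sketch: a plain Map[String, Any] stands in for the entity properties, and the helpers required, optionalList and toItem are hypothetical names introduced for illustration.

```scala
import scala.util.Try

object ItemFromPropertiesSketch {
  case class Item(
    title: String,
    date: String,
    imdbUrl: String,
    categories: Option[List[String]])

  // Stand-in for a required lookup: throws if the key is missing or not a String.
  def required(props: Map[String, Any], key: String): String =
    props.get(key) match {
      case Some(s: String) => s
      case other =>
        throw new NoSuchElementException(
          s"required property '$key' is missing or not a string: $other")
    }

  // Stand-in for an optional lookup: returns None when the key is absent.
  def optionalList(props: Map[String, Any], key: String): Option[List[String]] =
    props.get(key).collect { case xs: List[_] => xs.map(_.toString) }

  // Mirrors the Item construction in readTraining, using the stand-ins above.
  def toItem(props: Map[String, Any]): Item =
    Item(
      title = required(props, "title"),
      date = required(props, "date"),
      imdbUrl = required(props, "imdbUrl"),
      categories = optionalList(props, "categories"))

  def main(args: Array[String]): Unit = {
    val complete = Map[String, Any](
      "title" -> "title for movie i3",
      "date" -> "1947",
      "imdbUrl" -> "http://imdb.com/fake-url/i3")
    println(toItem(complete))               // categories is None
    println(Try(toItem(complete - "title"))) // a Failure, title is required
  }
}
```

An item without "categories" still loads, while an item without "title", "date" or "imdbUrl" raises an exception, which is the behavior the try/catch in readTraining logs and rethrows.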

Engine.scala

Add the same fields to the ItemScore class so that they are included in the query result.

// MODIFIED
case class ItemScore(
  item: String,
  title: String,
  date: String,
  imdbUrl: String,
  score: Double
) extends Serializable

ALSAlgorithm.scala

Change how the ItemScore object is created so that it is populated from the item's properties.

  def predict(model: ALSModel, query: Query): PredictedResult = {
    ...

    val itemScores = topScores.map { case (i, s) =>
      // MODIFIED
      val it = model.items(i)
      ItemScore(
        item = model.itemIntStringMap(i),
        title = it.title,
        date = it.date,
        imdbUrl = it.imdbUrl,
        score = s
      )
    }

    ...
  }

model.items(i) returns the Item object for index i, so you can read the properties that were set on it in the DataSource step.
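The mapping from scored indices to ItemScore objects can be exercised on its own. This is a minimal sketch, assuming plain Scala Maps as stand-ins for model.items and model.itemIntStringMap; the object name ItemScoreSketch and the function toItemScores are hypothetical.

```scala
object ItemScoreSketch {
  case class Item(
    title: String,
    date: String,
    imdbUrl: String,
    categories: Option[List[String]])

  case class ItemScore(
    item: String,
    title: String,
    date: String,
    imdbUrl: String,
    score: Double)

  // Turns (item index, score) pairs into ItemScore objects by looking up
  // the item's string ID and its properties, as predict does.
  def toItemScores(
      topScores: Seq[(Int, Double)],
      items: Map[Int, Item],
      itemIntStringMap: Map[Int, String]): Seq[ItemScore] =
    topScores.map { case (i, s) =>
      val it = items(i)
      ItemScore(
        item = itemIntStringMap(i),
        title = it.title,
        date = it.date,
        imdbUrl = it.imdbUrl,
        score = s)
    }
}
```

Note that items(i) throws if the index is unknown; this sketch assumes every scored index has an entry in both maps.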

Test the Result

Build, train, and deploy the engine, then test the result:

The query

$ curl -H "Content-Type: application/json" \
-d '{ "items": ["i1"], "num": 4 }' \
http://localhost:8000/queries.json

will return the result

{
  "itemScores":[
    {"item":"i3","title":"title for movie i3","date":"1947","imdbUrl":"http://imdb.com/fake-url/i3","score":0.5865418718902017},
    {"item":"i44","title":"title for movie i44","date":"1941","imdbUrl":"http://imdb.com/fake-url/i44","score":0.5740199916714374},
    {"item":"i37","title":"title for movie i37","date":"1940","imdbUrl":"http://imdb.com/fake-url/i37","score":0.5576820095310056},
    {"item":"i6","title":"title for movie i6","date":"1947","imdbUrl":"http://imdb.com/fake-url/i6","score":0.45856345689769473}
  ]
}

That's it! Your engine now returns the title, date and imdbUrl of each item along with its score.