Recommender Systems: II. Factorization Machine
Factorization Machine
1. Definition
- In essence, a generalized matrix factorization method
- Field: a type/column in the original dataset
- Feature: a value in the Field (Nike is a feature, Brand is a field)
- Motivation:
- Traditional regression methods cannot handle sparse matrices very well (too much computation time is wasted on null values)
- FM captures pairwise feature interactions while keeping linear time complexity
- It allows us to train on reliable information (latent features) from every pairwise combination of features in the model
- Main logic:
- Instead of using each field as a column, each feature gets its own column
- The columns are essentially a one-hot encoding of every value in each field, and each row is keyed by user id
- Each row therefore covers all the information a user has
- The degree-2 FM model (whose log-loss we minimize) is

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$$

- $w_i$: feature weight (to be optimized)
- $x_i$: feature value (column, given)
- $v_i$: latent vector of predefined low dimension $k$ (to be optimized)
- The idea is that, besides individual features, the model considers every combination of 2 features (hence degree = 2) as a factor
- Extension:
Field-aware FM (FFM)
- For each feature, the latent vector is no longer unique
- A feature may interact with features from different fields, so we differentiate the latent vector used for a feature based on the field of the feature it interacts with
- E.g.: Gaming is a feature in the Activity field, Male is a feature in the Gender field; the latent vector used for Male when interacting with an Activity feature like Gaming differs from the one used when interacting with a Brand feature like Nike
- Important note on numerical features
- Numerical features either need to be discretized (transformed into categorical features by breaking the entire range of a numerical feature into smaller ranges and label-encoding each range separately),
- or we can add a dummy field that duplicates the feature, keeping the numeric value for that row (for example, a feature with value 45.3 can be transformed to 1:1:45.3). However, the dummy fields may not be informative because they are merely duplicates of the features.
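To make the degree-2 interaction term above concrete, here is a small self-contained sketch (function names and toy numbers are my own, not from the post) showing the linear-time trick FM relies on: $\sum_{i<j}\langle v_i,v_j\rangle x_i x_j = \frac{1}{2}\sum_f\big[(\sum_i v_{i,f}x_i)^2 - \sum_i v_{i,f}^2 x_i^2\big]$, which turns the naive $O(kn^2)$ pairwise sum into an $O(kn)$ computation:

```python
import numpy as np

def fm_pairwise_naive(V, x):
    """O(k * n^2): sum over all pairs i < j of <v_i, v_j> * x_i * x_j."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += V[i] @ V[j] * x[i] * x[j]
    return total

def fm_pairwise_linear(V, x):
    """O(k * n): same quantity via 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2)."""
    s = V.T @ x                   # shape (k,): per-factor weighted sums
    s_sq = (V ** 2).T @ (x ** 2)  # shape (k,): per-factor sums of squares
    return 0.5 * float(np.sum(s ** 2 - s_sq))

rng = np.random.default_rng(0)
V = rng.normal(size=(5, 3))  # 5 features, latent dimension k = 3
x = rng.normal(size=5)       # feature values (dense here for simplicity)
assert np.isclose(fm_pairwise_naive(V, x), fm_pairwise_linear(V, x))
```

This identity is what makes FM training linear in the number of features despite modeling all pairwise interactions.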
2. Code Sample
- Note that the code below will fail as-is because we haven’t installed the xlearn package (too tedious)
- Refer to the code for inspiration
- Only apply the code if you actually need FM or FFM in your model
- Note that a DL method usually works better for an FM-integrated recommender
```python
import pandas as pd
```
Next, we need to convert the dataset to libffm format, which is necessary for xLearn to fit the model. The following function converts a dataset in standard dataframe format to libffm format:
- df = dataframe to be converted to ffm format
- type = 'Train' / 'Test' / 'Valid'
- numerics = list of all numeric fields
- categories = list of all categorical fields
- features = list of all features except the Label and Id
```python
def convert_to_ffm(df, type, numerics, categories, features):
    ...  # body omitted in the original post
```
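Since the body is omitted here, the following is a minimal sketch of what such a converter could look like (the function name, the simplified parameter list, and the assumed `Label` column are my own, not the post's original code). It emits one line per row in the libffm `label field:feature:value` format, assigning field indices by column order and feature indices on first sight:

```python
import pandas as pd

def convert_to_ffm_sketch(df, numerics, categories):
    """Convert a dataframe with a 'Label' column to libffm-format lines."""
    field_index = {col: i for i, col in enumerate(numerics + categories)}
    feature_index = {}  # maps (field, value) -> running feature id

    def feat_id(key):
        if key not in feature_index:
            feature_index[key] = len(feature_index)
        return feature_index[key]

    lines = []
    for _, row in df.iterrows():
        parts = [str(int(row["Label"]))]
        for col in numerics:
            # numeric feature: one dummy feature id per field, raw value kept
            parts.append(f"{field_index[col]}:{feat_id((col, col))}:{row[col]}")
        for col in categories:
            # categorical feature: one feature id per distinct value, value 1
            parts.append(f"{field_index[col]}:{feat_id((col, row[col]))}:1")
        lines.append(" ".join(parts))
    return lines

df = pd.DataFrame({"Label": [1, 0], "age": [25, 30], "brand": ["Nike", "Adidas"]})
print(convert_to_ffm_sketch(df, ["age"], ["brand"]))
# e.g. the first row becomes "1 0:0:25 1:1:1"
```

In a real pipeline the feature index would have to be built on the training set and reused for validation/test so that ids stay consistent across files.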
The xLearn library can handle csv as well as libsvm format for FMs, while we must convert to libffm format to use FFMs. Once we have the dataset in libffm format, we can train the model using the xLearn library. xLearn automatically performs early stopping using the validation/test log-loss, and we can also declare another metric to monitor on the validation set at each iteration of the stochastic gradient descent.
```python
ffm_model = xl.create_ffm()
```
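A typical end-to-end xLearn FFM call sequence looks roughly like the sketch below (file paths and hyperparameter values are illustrative assumptions; the xlearn import is deferred into the function since, as noted above, the package is not installed here):

```python
# Illustrative hyperparameters for binary classification with log-loss;
# "metric" adds an extra validation-set metric per iteration.
param = {
    "task": "binary",
    "lr": 0.2,
    "lambda": 0.002,
    "metric": "auc",
}

def train_ffm(train_path="train.ffm", valid_path="valid.ffm"):
    import xlearn as xl  # deferred: xlearn may not be installed
    model = xl.create_ffm()
    model.setTrain(train_path)
    model.setValidate(valid_path)  # enables automatic early stopping
    model.fit(param, "model.out")
    return model
```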
https://criss-wang.github.io/post/blogs/recom_sys/recommender-2/