# SequentialRecommender
`SequentialRecommender` wraps a `SequentialScorer` plus a SASRec estimator and adds the sequence-building logic required for temporal interaction data: it sorts interactions by timestamp, groups by user, truncates long histories, and passes the resulting sequences downstream.
## Overview

Owns:

- Sorting interactions by `TIMESTAMP` per user
- Truncating sequences to `max_len + 1`
- Handling the presence/absence of `OUTCOME` (training vs. inference)
Does not own:
- Tensor construction (estimator)
- Item vocabulary and score matrix (scorer)
## Quick Start

```python
from skrec.estimator.sequential.sasrec_estimator import SASRecClassifierEstimator
from skrec.scorer.sequential import SequentialScorer
from skrec.recommender.sequential.sequential_recommender import SequentialRecommender
from skrec.dataset.interactions_dataset import InteractionsDataset
from skrec.dataset.items_dataset import ItemsDataset

estimator = SASRecClassifierEstimator(
    hidden_units=50, num_blocks=2, num_heads=1,
    dropout_rate=0.2, epochs=200, max_len=50,
    early_stopping_patience=5,  # Optional: stop when val loss plateaus
    restore_best_weights=True,
)
scorer = SequentialScorer(estimator)
recommender = SequentialRecommender(scorer, max_len=50)

# Train (with optional validation split for early stopping)
recommender.train(
    items_ds=ItemsDataset(data_location="items.csv"),
    interactions_ds=InteractionsDataset(data_location="interactions.csv"),
    valid_interactions_ds=InteractionsDataset(data_location="valid.csv"),  # Optional
)

# Recommend
recs = recommender.recommend(
    interactions=recent_interactions_df,  # USER_ID, ITEM_ID, TIMESTAMP
    top_k=10,
)
```
## Training Data Requirements

The interactions dataset must have at minimum:

| Column | Type | Required |
|---|---|---|
| `USER_ID` | str / int | ✅ |
| `ITEM_ID` | str / int | ✅ |
| `TIMESTAMP` | int / float | ✅ |
| `OUTCOME` | float | ✅ at training, optional at inference |
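For reference, a minimal frame satisfying these requirements can be constructed with pandas. The values below are illustrative; only the column names and types come from the table above:

```python
import pandas as pd

# Minimal training-time interactions: one row per (user, item, time) event.
interactions = pd.DataFrame(
    {
        "USER_ID": ["u1", "u1", "u2"],
        "ITEM_ID": ["i10", "i11", "i10"],
        "TIMESTAMP": [1_700_000_000, 1_700_000_060, 1_700_000_120],
        "OUTCOME": [1.0, 0.0, 1.0],  # required at training, optional at inference
    }
)

# All four columns must be present for training.
required = {"USER_ID", "ITEM_ID", "TIMESTAMP", "OUTCOME"}
assert required.issubset(interactions.columns)
```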
Provide the built-in schema to enable validation:
```python
InteractionsDataset(
    data_location="interactions.csv",
    client_schema_path="skrec/dataset/required_schemas/interactions_schema_with_timestamp_training.yaml",
)
```
## Validation Split and Early Stopping

Pass `valid_interactions_ds` to `train()` to enable per-epoch validation-loss monitoring. The recommender builds validation sequences with the same `_build_sequences` logic, applied to all interactions except the last per user, so the last target in each validation sequence is the validation item (the second-to-last interaction in the full history).
```python
# Leave-last-two-out split
interactions["rank"] = interactions.groupby("USER_ID").cumcount(ascending=False)
train_df = interactions.drop(columns=["rank"])       # ALL interactions
valid_df = interactions[interactions["rank"] == 1]   # Second-to-last per user

# Persist the splits so the datasets below can load them
train_df.to_csv("train.csv", index=False)
valid_df.drop(columns=["rank"]).to_csv("valid.csv", index=False)

interactions_ds = InteractionsDataset(data_location="train.csv")
valid_inter_ds = InteractionsDataset(data_location="valid.csv")

recommender.train(
    items_ds=items_ds,
    interactions_ds=interactions_ds,
    valid_interactions_ds=valid_inter_ds,  # Enables early stopping in estimator
)
```
**Validation data format.** `valid_interactions_ds` uses the same raw interactions format as `interactions_ds`: one row per interaction with `USER_ID`, `ITEM_ID`, `OUTCOME`, `TIMESTAMP`. The recommender converts it to sequences internally.
## Sequence Building

`_build_sequences` is the core transformation step:

- Sort all rows by `(USER_ID, TIMESTAMP)` ascending
- Group by `USER_ID`: one row per user with a list of item IDs
- Truncate to the most recent `max_len + 1` items (one extra so the estimator can form `input = seq[:-1]`, `target = seq[1:]` with all `max_len` positions filled for long histories)
- Include `OUTCOME_SEQUENCE` only when `OUTCOME` is present (training mode)
```text
# Training mode: OUTCOME present
sequences_df columns: USER_ID, ITEM_SEQUENCE, OUTCOME_SEQUENCE

# Inference mode: no OUTCOME
sequences_df columns: USER_ID, ITEM_SEQUENCE
```
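The four steps above can be sketched in pandas. This is an illustrative approximation, not the library's actual `_build_sequences` implementation; `build_sequences` is a local name introduced here:

```python
import pandas as pd

def build_sequences(df: pd.DataFrame, max_len: int) -> pd.DataFrame:
    """Sketch of the sequence-building steps: sort, group, truncate."""
    df = df.sort_values(["USER_ID", "TIMESTAMP"])  # step 1: sort
    grouped = df.groupby("USER_ID", sort=False)
    # step 2: one row per user with a list of item IDs
    out = grouped["ITEM_ID"].agg(list).rename("ITEM_SEQUENCE").to_frame()
    # step 3: keep only the most recent max_len + 1 items
    out["ITEM_SEQUENCE"] = out["ITEM_SEQUENCE"].map(lambda s: s[-(max_len + 1):])
    # step 4: outcomes only in training mode (OUTCOME present)
    if "OUTCOME" in df.columns:
        out["OUTCOME_SEQUENCE"] = (
            grouped["OUTCOME"].agg(list).map(lambda s: s[-(max_len + 1):])
        )
    return out.reset_index()

interactions = pd.DataFrame({
    "USER_ID": ["u1"] * 4,
    "ITEM_ID": ["a", "b", "c", "d"],
    "TIMESTAMP": [3, 1, 2, 4],
    "OUTCOME": [1.0, 0.0, 1.0, 1.0],
})
seqs = build_sequences(interactions, max_len=2)
# With max_len=2, the most recent max_len + 1 = 3 items (by TIMESTAMP) are kept.
```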
## max_len Parameter

`max_len` must match between `SequentialRecommender` and `SASRecClassifierEstimator`. At training time the recommender always wins: it syncs its value to the estimator and logs a warning if they differ:
```python
# Correct: same max_len on both
estimator = SASRecClassifierEstimator(..., max_len=50)
recommender = SequentialRecommender(scorer, max_len=50)

# Also works: the recommender's value overrides and a warning is logged
estimator = SASRecClassifierEstimator(..., max_len=99)  # overridden to 50
recommender = SequentialRecommender(scorer, max_len=50)
```
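The reconciliation described above can be illustrated with a standalone sketch. `sync_max_len` is a hypothetical helper, not part of the library API; the real logic lives inside `SequentialRecommender`:

```python
import logging

logger = logging.getLogger(__name__)

def sync_max_len(recommender_max_len: int, estimator_max_len: int) -> int:
    """Sketch: the recommender's max_len wins; a mismatch is logged, not raised."""
    if estimator_max_len != recommender_max_len:
        logger.warning(
            "max_len mismatch: estimator=%d, recommender=%d; using recommender value",
            estimator_max_len,
            recommender_max_len,
        )
    return recommender_max_len

effective = sync_max_len(50, 99)  # recommender value (50) overrides
```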
## Inference

At inference time, pass raw interaction rows in the same `TIMESTAMP`-ordered format as training. `OUTCOME` is optional.
```python
# Score all items for each user (returns ndarray, shape: n_users × n_items)
scores = recommender.score_items(interactions=recent_df)

# Recommend top-k items (returns ndarray, shape: n_users × top_k)
recs = recommender.recommend(interactions=recent_df, top_k=10)
```
**History window.** Only the most recent `max_len` interactions are used per user at inference; older interactions are silently dropped.
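Assuming the column names used throughout this page, that window is equivalent to keeping each user's last `max_len` rows, e.g. with pandas:

```python
import pandas as pd

recent_df = pd.DataFrame({
    "USER_ID": ["u1", "u1", "u1", "u2", "u2"],
    "ITEM_ID": ["a", "b", "c", "x", "y"],
    "TIMESTAMP": [1, 2, 3, 1, 2],
})

max_len = 2
# Keep only the last max_len interactions per user; older rows are dropped.
windowed = (
    recent_df.sort_values(["USER_ID", "TIMESTAMP"])
    .groupby("USER_ID")
    .tail(max_len)
)
# u1 keeps "b" and "c" (item "a" falls outside the window); u2 keeps both rows.
```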
## Probabilistic Sampling

As with `RankingRecommender`, you can sample from the score distribution instead of taking the argmax:
```python
recs = recommender.recommend(
    interactions=recent_df,
    top_k=10,
    sampling_temperature=0.5,  # 0 = deterministic, >0 = probabilistic
)
```
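A common way to implement this kind of sampling is a softmax over temperature-scaled scores. The sketch below is an assumption about the mechanism, not the library's exact code; `sample_top_k` is a hypothetical helper:

```python
import numpy as np

def sample_top_k(scores: np.ndarray, top_k: int, temperature: float, rng=None) -> np.ndarray:
    """Pick top_k item indices per user: argmax at temperature 0, else sample."""
    rng = rng or np.random.default_rng(0)
    if temperature == 0:
        # Deterministic: highest-scoring items first.
        return np.argsort(-scores, axis=1)[:, :top_k]
    # Probabilistic: softmax over temperature-scaled scores.
    z = scores / temperature
    z -= z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return np.stack(
        [rng.choice(scores.shape[1], size=top_k, replace=False, p=p) for p in probs]
    )

scores = np.array([[0.1, 2.0, 0.5, 1.5]])
det = sample_top_k(scores, top_k=2, temperature=0.0)   # always [[1, 3]]
samp = sample_top_k(scores, top_k=2, temperature=0.5)  # varies with the RNG seed
```

Lower temperatures concentrate probability mass on the top-scoring items; higher temperatures flatten the distribution and surface more long-tail items.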
## When to Use SequentialRecommender

✅ Use SequentialRecommender when:

- Interaction order matters (video watch history, browsing sessions, purchase journeys)
- You have `TIMESTAMP` data for all interactions
- Users have at least a few historical interactions

❌ Don't use SequentialRecommender when:

- No `TIMESTAMP` is available (use RankingRecommender with NCF instead)
- Interactions are independent and unordered
## Comparison with RankingRecommender

| | SequentialRecommender | RankingRecommender |
|---|---|---|
| Order-aware | ✅ Yes | ❌ No |
| Requires `TIMESTAMP` | ✅ Yes | ❌ No |
| Compatible estimators | SASRec only | All estimators |
| User/item features | ❌ No (ID-based) | ✅ Yes |
| Inference input | Raw interactions + `TIMESTAMP` | User features or interactions |
## Next Steps
- SASRec Estimator Guide — Architecture, hyperparameters, performance benchmarks
- HierarchicalSequentialRecommender — Session-aware alternative using HRNN
- Training Guide — General training patterns
- Evaluation Guide — Measuring ranking quality (HR@K, NDCG@K)