Explanations
instances of the word "review" and its derivatives in various contexts
Negative Logits
któ         -0.78
Alba        -0.68
Laughs      -0.66
HHHHHHHH    -0.66
getN        -0.64
Piet        -0.63
🤣🤣        -0.62
ματο        -0.62
Schulte     -0.61
            -0.60
Positive Logits
Review      1.88
review      1.86
review      1.82
reviews     1.80
REVIEW      1.79
Review      1.79
Reviews     1.74
Reviews     1.72
REVIEW      1.71
reviewer    1.63
Activations Density: 0.090%