INDEX
Explanations
phrases expressing opinions or ratings
Negative Logits
řeba    -0.58
otomatig    -0.56
المعيارى    -0.48
Duty    -0.46
duty    -0.46
<()>    -0.46
:“……”    -0.46
elter    -0.45
antMatchers    -0.45
⊂    -0.45
Positive Logits
rating    0.83
✭✭    0.70
stars    0.69
rating    0.67
RATING    0.65
RATING    0.64
Rating    0.63
Rating    0.62
ratings    0.61
faute    0.60
Activations Density 0.212%