Explanations
phrases indicating strong opinions or critiques about video games
Negative Logits
лаÑĪ        -0.15
èĥĨ         -0.14
locs        -0.14
ableView    -0.14
achinery    -0.14
vanced      -0.14
.dsl        -0.13
abet        -0.13
Suspension  -0.13
reesome     -0.13
Positive Logits
release     0.17
biên        0.16
released    0.16
Released    0.15
Released    0.15
iena        0.15
release     0.15
released    0.15
Release     0.15
releases    0.15
Activations Density 0.177%