INDEX
Explanations
sentences related to personal opinions and reflections
emotional expressions and opinions about relationships
Negative Logits
eatured       -0.51
urst          -0.50
respective    -0.48
incumbent     -0.47
è£ħ           -0.47
atories       -0.46
controvers    -0.46
odge          -0.45
idespread     -0.45
geoning       -0.45
Positive Logits
fuckin        0.87
fucking       0.85
goddamn       0.74
..."          0.70
stupid        0.66
fucked        0.66
bitch         0.66
shitty        0.66
shit          0.64
godd          0.63
Activations Density: 3.005%