Explanations
exclamatory expressions of strong emotions such as happiness, excitement, and admiration
intensifiers expressing strong emotions or feelings
Negative Logits
lihood        -0.71
istrates      -0.64
Advantage     -0.63
Peninsula     -0.63
strand        -0.62
ership        -0.61
istrate       -0.61
coerc         -0.60
aviour        -0.59
expectancy    -0.59
Positive Logits
ooo                 1.56
oooo                1.50
oooooooo            1.36
oooooooooooooooo    1.20
bered               1.19
apy                 1.08
oo                  1.06
ppy                 1.02
glad                0.97
goddamn             0.94
Activations Density: 0.089%