INDEX
Explanations
words related to physical fitness or exercise
references to specific people, particularly those associated with events or incidents
Negative Logits
~~~~~~~~          -0.82
~~~~~~~~~~~~~~~~  -0.78
EMP               -0.73
CHAT              -0.71
[|                -0.69
rt                -0.67
Freak             -0.67
Ended             -0.66
plugins           -0.66
Ò                 -0.64
Positive Logits
sburgh    1.01
å§«       0.94
s         0.87
entary    0.85
sburg     0.82
ular      0.81
shire     0.79
enza      0.78
shaw      0.77
ication   0.77
Activations Density 0.031%