NEGATIVE LOGITS
ture         0.91
ين           0.91
times        0.90
t            0.86
ان           0.81
een          0.81
timestamp    0.80
nge          0.79
trigger      0.78
penguin      0.78
POSITIVE LOGITS
rights       1.02
gotta        0.98
Rights       0.95
wages        0.87
sake         0.86
prerogative  0.85
remorse      0.85
suffrage     0.84
salaries     0.83
POV          0.82
Activations Density 0.060%