Explanations
instances of the letter 'F'
Negative Logits
Token       Logit
comm        -0.17
aff         -0.15
unb         -0.15
OTAL        -0.14
anvas       -0.14
emony       -0.14
pios        -0.14
ány         -0.14
Somebody    -0.14
cus         -0.14
Positive Logits
Token       Logit
razier      0.36
uchs        0.33
ischer      0.32
letcher     0.32
enton       0.31
erguson     0.31
inkel       0.30
oster       0.30
owler       0.30
isher       0.29
Activations Density 0.031%