INDEX
Explanations
names and locations, particularly those involving sports
the presence of specific gerunds or verb forms ending in 'ing'
Negative Logits
ihar      -0.76
nery      -0.75
ngth      -0.73
ably      -0.69
icles     -0.68
uba       -0.67
ophob     -0.66
igning    -0.65
iness     -0.65
idity     -0.65
Positive Logits
tons       1.35
ham        1.17
HAM        1.13
ton        1.05
redients   0.97
Stones     0.96
Sands      0.93
edge       0.86
lass       0.85
haus       0.83
Activations Density 0.119%
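
A rough sketch of how a density figure like the one above could be computed, assuming "activations density" means the percentage of sampled token positions on which the feature has a nonzero activation. That reading is an assumption, not something this page states. The function name, the threshold parameter, and the sample counts are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires at all.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 token positions, 119 of them active.
sample = [0.0] * 99_881 + [1.5] * 119
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density well below 1% indicates a sparse feature that fires on only a small share of tokens.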