INDEX
Explanations
proper names, specifically the name "Thomas" and its variations
references to the name "Thomas."
Negative Logits
Token     Logit
raq       -0.73
────      -0.64
bull      -0.64
arena     -0.62
forums    -0.61
eye       -0.60
grill     -0.59
rooms     -0.59
shaft     -0.59
inbox     -0.59
Positive Logits
Token     Logit
Thomas    3.33
Thomas    1.78
Willis    1.09
Francois  1.06
Tom       1.05
Thompson  1.01
Thom      1.00
Ludwig    0.97
Amos      0.96
William   0.95
Activations Density 0.016%
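
For intuition about the density figure, here is a minimal sketch that assumes density means the fraction of token positions where the feature's activation is above zero. The page does not state how its value was computed, so the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 16 of them active.
sample = [0.0] * 99_984 + [1.5] * 16
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.016%"
```

Under this reading, a density of 0.016% corresponds to the feature firing on roughly 16 of every 100,000 tokens.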