INDEX
Explanations
terms related to financial transactions and regulations
Negative Logits
man       -0.51
sim       -0.51
variety   -0.49
@         -0.49
young     -0.49
tés       -0.49
talking   -0.48
x         -0.48
men       -0.48
果        -0.48
Positive Logits
purpoſe    1.11
fubject    1.05
reaſon     1.02
ſtate      0.99
myſelf     0.96
pleaſure   0.95
himſelf    0.93
ſhall      0.91
cauſe      0.90
itſelf     0.90
Activations Density 0.350%