INDEX
Explanations
proper nouns
mentions of the name "Ron."
Negative Logits
rigging        -0.66
LER            -0.64
ready          -0.62
cy             -0.61
seeded         -0.59
sympathetic    -0.58
chance         -0.57
warrants       -0.56
warrant        -0.56
enegger        -0.55
Positive Logits
ald            1.19
aldo           1.17
nie            0.99
ny             0.96
Swanson        0.91
nen            0.89
ni             0.88
Paul           0.88
imo            0.86
Weasley        0.85
Activations Density: 0.024%