Explanations
references to foundational principles or criteria underlying a topic
Negative Logits
Token         Logit
purpoſe       -0.93
Monfieur      -0.88
pleaſure      -0.85
Efq           -0.81
houſe         -0.80
itſelf        -0.80
Jefus         -0.80
Chriftian     -0.78
faſt          -0.78
habet         -0.77
Positive Logits
Token         Logit
dientemente   0.78
μφωνα         0.75
исленность    0.62
aufgrund      0.61
Portail       0.61
]<<           0.61
través        0.59
учетом        0.59
makeText      0.57
of            0.56
Activations Density 0.636%