Explanations
phrases that relate to responses and connections in context
Negative Logits
Token        Logit
pleaſure     -0.98
Theſe        -0.98
Monfieur     -0.97
Diſ          -0.97
itſelf       -0.94
Jefus        -0.93
Chriftian    -0.92
houſe        -0.92
Efq          -0.90
purpoſe      -0.88
Positive Logits
Token        Logit
due          0.86
because      0.84
pursuant     0.81
μφωνα        0.76
debido       0.74
according    0.73
вслед        0.72
owing        0.68
dientemente  0.66
because      0.65
Activations Density 0.450%