INDEX
Explanations
Crisis Text Line and Trevor
Negative Logits
udades      0.39
จุ          0.39
ricted      0.38
マ          0.35
ierd        0.35
Bacterial   0.35
ichel       0.35
क्लीन       0.35
etok        0.35
umna        0.34
Positive Logits
Crisis      0.56
crisis      0.54
Crisis      0.54
Trevor      0.49
刑          0.46
treason     0.42
crises      0.42
tre         0.41
кризи       0.41
Tre         0.40
Activations Density: 0.051%