INDEX
Explanations
"Hi" followed by name or punctuation
Negative Logits
프    0.86
hình    0.83
חק    0.83
今日    0.81
בת    0.80
প্র    0.79
dokument    0.79
코    0.78
но    0.77
פר    0.77
Positive Logits
fellow    0.82
tind    0.80
Mr    0.80
क्षेत्र    0.77
anf    0.77
fears    0.76
tener    0.76
mals    0.74
tendency    0.73
fais    0.73
Activations Density 0.015%