INDEX
Explanations
instances where someone makes contact or tries to communicate with others for various reasons
instances of reaching out to others for communication or support
Negative Logits
Tsukuyomi    -0.82
cake         -0.70
ardo         -0.65
CARD         -0.61
antry        -0.59
Kats         -0.59
oxidation    -0.58
PET          -0.58
Syd          -0.58
Kru          -0.58
Positive Logits
stretched    1.04
doors        0.88
wards        0.80
posts        0.74
reach        0.73
door         0.73
reprene      0.72
worm         0.70
spr          0.69
reb          0.69
Activations Density: 0.021%