Social Web Community Group Teleconference

11 Oct 2017

Attendees

Present: ajordan, aaronpk, cwebber, melody, puckipedia
Regrets:
Chair: cwebber
Scribe: puckipedia, cwebber

cwebber: a few things on the agenda, next one is activitypub's mediaUpload endpoint. maybe we should shuffle that to the end
... let's talk about anti-abuse tooling
... generally the topic is letting users be protected against abuse, like a gamergate pile-on. we do have some things in the existing protocols and applications. Mastodon has a few cross-server filtering tools, we have a block tool in AP, don't know if there's something in webmention, aaronpk, is there something for filtering?

aaronpk: it's out of scope. it's up to the receiver to decide what to do with it. webmention just gets you to "yes, this thing exists", not about who it is or whether it is useful

cwebber: so, melody, I'm volunteering you, melody has been doing some work in the anti-abuse tooling space

melody: the work i've done so far isn't directly related to this

cwebber: I know it's a bit on the spot, but I actively encouraged her to be on the call, because her work is invested in these spaces

melody: I have a continued investment in this area, I'm still trying to get my bearings around what level of tooling, and like, which protocols are relevant to the ideas I have for anti-abuse in the social web
... I'm having trouble contextualizing where the stuff I'm doing fits in the big picture

cwebber: it's good to hear your perspective from this space, what would be useful is to kind of list out the possible avenues for anti-abuse tooling. this is something we're not going to resolve on this call
... I don't think we need to use a queue, let's just list out the general avenues that we know

<cwebber> - Blocklists

<cwebber> - Block "activity"

<cwebber> - Word filtering

cwebber: one other route might be word filtering, specifically a blacklist of words you don't want to see, that starts to move down the direction of bayesian spam filtering
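
To illustrate the word-filtering idea, here is a minimal sketch in Python. The function name and the sample word list are hypothetical, and it assumes simple whole-word, case-insensitive matching rather than anything a real server does.

```python
import re
from typing import Iterable


def matches_blocked_word(text: str, blocked_words: Iterable[str]) -> bool:
    """Return True if any whole word in `text` is on the blocked list."""
    blocked = {word.lower() for word in blocked_words}
    # Split into lowercase word tokens so "SPOILERS!" matches "spoilers".
    words = re.findall(r"[\w']+", text.lower())
    return any(word in blocked for word in words)


# Hypothetical per-user list of words the user does not want to see.
blocked = {"spoilers", "gamergate"}
print(matches_blocked_word("Huge SPOILERS ahead!", blocked))
print(matches_blocked_word("A quiet day at the lake", blocked))
```

Whole-word matching avoids hiding posts that merely contain a blocked word as a substring, at the cost of missing deliberate misspellings.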

<cwebber> - Bayesian inference

cwebber: instead of explicitly listing what you don't want, have a bayesian filter compile what you want based on what you mark
... this inference might slowly move to more neural-type approaches, unfortunately these involve having a very large corpus, a local bayesian filter doesn't involve a lot of work
... the risk is that this can falsely flag people
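
A local Bayesian filter of the kind described could look roughly like the following sketch. It is a tiny naive Bayes classifier with add-one smoothing, written for illustration only; the class name, the "wanted"/"unwanted" labels, and the whitespace tokenizer are all assumptions, not anything specified on the call.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Naive Bayes over words, trained from messages the user marks."""

    LABELS = ("unwanted", "wanted")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def mark(self, text, label):
        """Record one message the user marked as 'wanted' or 'unwanted'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def unwanted_probability(self, text):
        """Estimate the probability that `text` is unwanted."""
        vocab = set()
        for label in self.LABELS:
            vocab.update(self.word_counts[label])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in self.LABELS:
            # Smoothed prior: how often the user marks messages this way.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing so unseen words never give log(0).
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalize the two log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["unwanted"] / (weights["unwanted"] + weights["wanted"])


spam_filter = TinyBayesFilter()
spam_filter.mark("you are garbage", "unwanted")
spam_filter.mark("lovely photo thanks", "wanted")
print(spam_filter.unwanted_probability("garbage"))
```

With so little training data the estimates are crude, which is one way the false-flagging risk mentioned above shows up in practice.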

<cwebber> - neural networks

cwebber: in AP and webmention, each has *a* public endpoint anyone can send a message to, whether to reply to a post, or to a user there's an endpoint to send to. another option is to instead have a web of trust of people you are allowing in. so a whitelist, and you might look at the list of people you trust and see who *they* trust
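
The "see who *they* trust" step can be sketched as a bounded walk over a trust graph. This is only an illustration: the function name, the two-hop limit, and the sample names are hypothetical, and a real deployment would need to decide how trust lists are published and verified.

```python
def is_trusted(sender, me, trusts, max_hops=2):
    """Return True if `sender` is reachable from `me` within `max_hops`.

    `trusts` maps each person to the set of people they directly trust.
    """
    seen = {me}
    frontier = {me}
    for _ in range(max_hops):
        # Everyone trusted by the current frontier, minus people already seen.
        frontier = {
            trusted
            for person in frontier
            for trusted in trusts.get(person, ())
        } - seen
        if sender in frontier:
            return True
        seen |= frontier
    return False


# Hypothetical trust lists.
trusts = {"alice": {"bob"}, "bob": {"carol"}, "carol": {"dave"}}
print(is_trusted("carol", "alice", trusts))  # carol is a friend of a friend
print(is_trusted("dave", "alice", trusts))   # dave is three hops away
```

Limiting the number of hops keeps the allowed set small; raising `max_hops` trades protection for reach.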

<cwebber> - web of trust

cwebber: maybe we can say well, our public inbox is only for people we trust, but maybe there's another inbox that is for anyone, which would require a payment mechanism, so if you're not in my trust network, you can send a message but it costs 5 cents
... this makes spam useless, because it costs people to spam people, and makes dogpiling expensive
... if, it turns out "oh, this is someone I hadn't marked", you can refund it
... downside is that it can incorporate more pay-to-play
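
The paid-inbox idea with refunds might be modelled like this. It is a sketch under stated assumptions: no payment system is implied, the `Inbox` class and its return strings are hypothetical, and the 5-cent figure is simply the one mentioned above.

```python
from dataclasses import dataclass, field

POSTAGE_CENTS = 5  # example price mentioned in the discussion


@dataclass
class Inbox:
    trusted: set
    delivered: list = field(default_factory=list)
    held_postage: dict = field(default_factory=dict)  # message id -> cents

    def receive(self, message_id, sender, paid_cents=0):
        """Accept trusted senders for free; others must attach postage."""
        if sender in self.trusted:
            self.delivered.append(message_id)
            return "accepted"
        if paid_cents >= POSTAGE_CENTS:
            self.held_postage[message_id] = paid_cents
            self.delivered.append(message_id)
            return "accepted-with-postage"
        return "rejected"

    def refund(self, message_id):
        """Release held postage for a message; returns cents refunded."""
        return self.held_postage.pop(message_id, 0)


inbox = Inbox(trusted={"bob"})
print(inbox.receive("m1", "bob"))
print(inbox.receive("m2", "stranger"))
print(inbox.receive("m3", "stranger", paid_cents=5))
print(inbox.refund("m3"))
```

Holding the postage until the recipient decides is what makes the refund case possible for a welcome sender who was not yet marked as trusted.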

<cwebber> scribenick: cwebber

puckipedia: I also have some thoughts for anti-abuse
... for example in mastodon a block is sent to the corresponding server which of course can be controlled by that person
... in activitypub it's kept kind of implementation-specified whether it's sent
... going to come back to this potentially
... I just thought of a flag activity, for example if you think that person is malicious you send a flag activity, but one that is not delivered to that actor, for example a flag activity about a malicious user on a non-malicious server

<puckipedia> scribenick: puckipedia

cwebber: a way this could be done is, the actor has an endpoints property, which is for server-wide shared endpoints
... so those are not specific to the user, but to the server. something which was suggested was, well, couldn't we have a server-wide actor, in the endpoints, where you could deliver the flag activity
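
As a rough sketch of that suggestion: `Flag` is an ActivityStreams activity type and `endpoints` is an existing actor property, but the `serverActor` key below is purely hypothetical (nothing of that name was agreed), as are the function name and all example URLs.

```python
import json


def build_flag(reporter_id, flagged_actor):
    """Build a Flag activity and pick where to deliver it.

    The flag is addressed to a server-wide actor, not to the flagged user.
    """
    flag = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Flag",
        "actor": reporter_id,
        "object": flagged_actor["id"],
    }
    # "serverActor" is a hypothetical endpoints key used for illustration.
    target = flagged_actor.get("endpoints", {}).get("serverActor")
    return flag, target


flagged = {
    "id": "https://social.example/users/troll",
    "inbox": "https://social.example/users/troll/inbox",
    "endpoints": {
        "sharedInbox": "https://social.example/inbox",
        "serverActor": "https://social.example/actor",
    },
}
flag, target = build_flag("https://home.example/users/alice", flagged)
print(json.dumps(flag, indent=2))
print("deliver to:", target)
```

If the flagged actor's server advertises no such endpoint, `target` is `None` and the sender would have to fall back on some other policy.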

<cwebber> scribenick: cwebber

puckipedia: I think that would be nice because you could say it's the server admin, so you could send a message to the server admin, so you could send a message to that actor as well
... maybe that actor could also be a public announcement system. For instance, if the server sends out announcements it's from that actor

<puckipedia> scribenick: puckipedia

cwebber: there is a generalized topic I'd like to discuss: everything we've talked about, except possibly neural networks, can be federated, and maybe even user-run
... so it's possible a user is in control of their blocklist, and a user can even run, client-side, their own filtering tools. a concern is, sometimes when anti-abuse tooling comes up, will you need a large centralized node?
... we need to talk about this, there was more of a push a year ago, when there was more faith from people who are concerned with protecting people for social justice reasons, that these neural network types will protect people
... I've seen a shift that people have become more suspicious. we've seen these networks pick up institutional racist/sexist biases of the system
... are there reasons why a centralized system can do things a decentralized system can't do?

ajordan: so I will fully admit to not having followed this discussion, but you just mentioned running things client-side. I would like to say that is, in my view, unideal. client-side solutions only work for one client, so you get an inconsistent timeline across clients, and you still have to pay the cost of network traffic before you can filter it out
... I really would like not to push this to the client

<ajordan> also the other thing I should've said is that the server almost always has more information than the client

melody: I agree with ajordan, if you have a keyword filter list you want those to be in place no matter which client. it ends up being pretty important that those things work. coming from a tumblr background, there's a few good browser extensions that have keyword filtering with differing sensitivity, but it falls flat on mobile, where you see things you wouldn't have seen on the desktop.

<ajordan> it knows what IP addresses activities are being delivered from, it can authenticate/interrogate other servers more thoroughly, and it knows what other users might have flagged as spam

melody: I don't know if you need deep centralization, but one thing that hasn't been discussed is allowing some method of assessing risk of a message, you could use some kind of algorithm like a spam filter, or base it on keywords, or based on someone that isn't on your web of trust. carrying this info with the message would give clients the ability to surface messages with different kinds of risk with a customized amount of risk at any given moment when you're interacting with your timeline/inbox/whatever

cwebber: we got consensus that we don't want disjointness between clients. we want ability, on the client-server level, that they all share the same data and don't get out of sync. the other point seems to be wrapping things in an envelope, so you get the message, and the server wraps it in an envelope with a level of risk
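
The envelope idea could be sketched as follows, assuming a numeric risk score between 0 and 1. The `Envelope` fields, the scoring scale, and the function name are hypothetical; the call did not settle on any format.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    """A message wrapped by the server with its risk assessment."""
    message: dict
    risk: float                      # 0.0 (safe) to 1.0 (very risky), assumed scale
    reasons: list = field(default_factory=list)


def split_by_risk(envelopes, max_risk):
    """Split envelopes into (shown, collapsed) for a user's current tolerance."""
    shown = [e for e in envelopes if e.risk <= max_risk]
    collapsed = [e for e in envelopes if e.risk > max_risk]
    return shown, collapsed


inbox = [
    Envelope({"content": "nice post!"}, 0.1),
    Envelope({"content": "..."}, 0.8, ["sender not in web of trust", "keyword match"]),
]
shown, collapsed = split_by_risk(inbox, max_risk=0.5)
print(len(shown), len(collapsed))
```

Because the score travels with the message, every client sees the same assessment, and the user can raise or lower `max_risk` at any moment without the server reprocessing anything.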

ajordan: I am not convinced this is not dangerously close to full centralization, but I think it would be interesting to, instead of standardising these anti-abuse things directly, if we standardise a way for a client to say to a server "please send this post to this other service for me" and the service is allowed to, either add stuff to a post (risk assessment), or, y'know, reject it entirely
... if we did that, servers wouldn't have to worry about this as much, and you could iterate on anti-abuse protocols independently. and servers don't have to reimplement it, and you'd get pretty wide-spread good tooling for free
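
One way to picture this delegation is a chain of filter services, each of which may annotate a post or reject it. The sketch below runs the filters as local functions for simplicity; in the proposal they would be external services. The `riskAnnotations` key, the function names, and the example URL are all hypothetical.

```python
from typing import Callable, Iterable, Optional

# A filter takes a post and returns it (possibly annotated), or None to reject.
Filter = Callable[[dict], Optional[dict]]


def keyword_annotator(post: dict) -> Optional[dict]:
    """Add a risk annotation when the content contains a suspicious phrase."""
    annotated = dict(post)
    if "buy now" in post.get("content", "").lower():
        annotated["riskAnnotations"] = post.get("riskAnnotations", []) + ["keyword:buy now"]
    return annotated


def blocked_sender_rejecter(post: dict) -> Optional[dict]:
    """Reject posts from a known-bad actor entirely."""
    if post.get("actor") == "https://spam.example/users/bot":
        return None
    return post


def run_filters(post: dict, filters: Iterable[Filter]) -> Optional[dict]:
    """Pass the post through each filter in turn; stop at the first rejection."""
    current: Optional[dict] = post
    for apply_filter in filters:
        current = apply_filter(current)
        if current is None:
            return None
    return current


chain = [blocked_sender_rejecter, keyword_annotator]
print(run_filters({"actor": "https://ok.example/users/a", "content": "Buy now!"}, chain))
print(run_filters({"actor": "https://spam.example/users/bot", "content": "hi"}, chain))
```

Keeping each filter behind the same small interface is what would let anti-abuse tooling evolve independently of the server.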

cwebber: I agree that this simplifies things for implementors of AP servers. If we ask someone "we'd love to see more AP implementations", you can implement the server instead of also adding the other complex tooling. this is a strong advantage, and we've got aggregation of information
... but it effectively eradicates privacy from the system, and that might be a large point of concern, this could be a target for state actors, malicious actors who would like to see private posts, or even do the reverse, target it to their own needs
... but it also has upsides in simplifying things

<ajordan> https://github.com/e14n/activityspam

<Loqi> [e14n] activityspam: Bayesian spam filter for activitystrea.ms data

cwebber: activityspam was a server that did spam filtering for you, at the costs of what we just discussed

<cwebber> q:

<Zakim> ajordan, you wanted to clarify "eradicates privacy"

ajordan: you say it eradicates privacy, can you clarify?

cwebber: the concern is that, say, I send a message to you directly. then we have some anti-spam central server, and you and I both run private nodes. We have the impression that I'm sending something directly and thus it's a private channel. in fact, this other server, the anti-spam server, would see the communication. you would have the level of surveillance Google has over Gmail servers, which is equivalent to the privacy if you use the anti-spam service
... it opens the possibility for centralized surveillance

ajordan: even if we run private nodes, I could still send the message to others

cwebber: by encouraging many nodes to use a centralized service we open the door to, e.g. activityspam is running out of money, "oh hey, we could just sell our data to an advertising company"
... but we're encouraging a dangerous centralized service

melody: just thinking about the other side, how breaking it off into a separate service might help iterating, if there isn't client side support for having to deal with the messages. if this is being iterated on separately then only clients that know about the other service will know what to do with the information

cwebber: either we say accept/reject, or we need to standardize the envelope
... anything else somebody would want to talk about?

<ajordan> melody: no external services would be allowed to drop messages on the floor so you could decide whether you wanted "risk annotations" or drops based on what your clients support

melody: it's sort-of related to something else I know has been discussed, which is the sensitive flag, which I think was mostly discussed in the context of content warnings. I'm wondering if there's a possibility for using that for anti-abuse purposes as well, having a standardized way of "hide this by default" sounds like it could be a useful tool
... but it might require some additional information to be useful for that case, like a [??] message or some other information to go with it

cwebber: the sensitive flag, we adopted it because it's what Mastodon uses, there's both a sensitive flag and a content warning system, the CW system allows you to add a human-readable description
... the sensitive flag is more a generalized "not safe for work/whatever" type flag, but it doesn't give context what that means
... it's like actor/server-specific to make that decision
... another route that isn't implemented in Mastodon or the spec was having tags that are themselves marked as sensitive. this would allow users to set more careful filtering, maybe a user is fine seeing steven universe spoilers, but they don't want to see pornography or graphic violence. maybe another person is fine with pornography but not okay with those other two
... maybe someone doesn't want to see politics by default
... this would be more closely related to content warning as a user-typed field and tags combined. the upside is that you're more precise, the downside is that people might not actually do it
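
The per-tag filtering described here can be sketched as a simple intersection between a post's tags and each user's own list of tags to hide. The function name and the tag strings are hypothetical; neither Mastodon nor the spec defines this.

```python
def hidden_tags_for(post_tags, user_hidden_tags):
    """Return the post's tags that this user has chosen to hide, sorted."""
    return sorted(set(post_tags) & set(user_hidden_tags))


post_tags = ["steven-universe-spoilers", "fanart"]
# Two hypothetical users with different preferences.
alice_hides = {"pornography", "graphic-violence"}
bob_hides = {"steven-universe-spoilers", "politics"}

print(hidden_tags_for(post_tags, alice_hides))  # nothing matches for alice
print(hidden_tags_for(post_tags, bob_hides))    # the spoiler tag matches for bob
```

The same post is shown to one user and hidden for another, and the matching tags double as the reason shown to the user. It only works if authors actually tag their posts, which is the downside noted above.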

melody: that's a lot of what I had in mind, like, users on tumblr are fairly happy with managing a list of things they don't want to see, it seems the ability to state the message being collapsed by default with a reason, then use for any of these approaches
... you could use it by having that reason be typed in by the creator, and then trip the flag, or you could have it be on the recipient end like a user of tumblr and have the server decide to add the reason and the flag
... so that flag plus reason is common to both of those approaches
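
A sketch of the shared "flag plus reason" shape, covering both paths: the reason typed in by the creator, and the reason added on the recipient's side from that user's filter list. It assumes the creator's warning text travels in `summary` alongside a `sensitive` boolean, as in Mastodon's content-warning usage; the function name and reason strings are hypothetical.

```python
def collapse_decision(post, recipient_filter_terms):
    """Return (collapsed, reason) for showing a post to one recipient."""
    # Creator-side: the author tripped the flag and may have given a reason.
    if post.get("sensitive"):
        return True, post.get("summary") or "marked sensitive by author"
    # Recipient-side: the server adds the flag and reason from the user's list.
    content = post.get("content", "").lower()
    for term in recipient_filter_terms:
        if term.lower() in content:
            return True, f"matches your filter: {term}"
    return False, None


print(collapse_decision({"sensitive": True, "summary": "war reporting"}, []))
print(collapse_decision({"content": "Election results tonight"}, ["election"]))
print(collapse_decision({"content": "hello"}, ["election"]))
```

Either path ends in the same pair of values, so a client only needs to understand one thing: collapse by default, and show the reason.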

<Zakim> cwebber, you wanted to talk about danger of "sensitive" boolean

<ajordan> it occurs to me that earlier basically I was describing a superpowered version of milters

cwebber: the reason the sensitive flag is an extension in AP instead of in the AS spec is there's a lot of concerns about a per-post boolean flag, we've seen problems. on YouTube, content that was meant for LGBT audiences, not even sexually explicit, was being marked as NSFW
... we've seen this with reporting on war crimes etc etc that might not be graphically explicit, but gets marked
... so if we go for boolean, we should carry contextual information with it

<ajordan> so not "just" a boolean ;)

cwebber: a good next step is to capture what we said on a wiki page, what we think is the future direction for this
... if there's any volunteers, I'd rather not do this alone

<ajordan> I don't want to volunteer and then not have the time :/

melody: I might be able to help with that in the near future, I'm pretty busy this week

<ajordan> ^^^ same

cwebber: melody, would you want to collaborate with this starting next month?

melody: I think that's doable

<cwebber> PROPOSED: Have this meeting extend to 90 minutes rather than 60

<cwebber> +1

<ajordan> -0 by all means go for it but I have to go right on the hour

+0

<aaronpk> -0

<melody> +0

cwebber: it doesn't seem there's any strong exuberance to keep going an hour and a half, so we'll push the topics to next week. I feel like the follower migration is a whole meeting's worth of topic, and so is publishing which extensions are used by a server
... so AP had a mediaUpload endpoint, which allowed you to upload images/video/whatever, there were a few aspects we were unsure about, like chunked uploads, and whether uploading a post should Create it to your outbox immediately
... or whether or not you should have an object you can attach to other objects

<cwebber> https://github.com/w3c/activitypub/issues/239#issuecomment-335548123

<Loqi> [cwebber] ```

<Loqi> <eprodrom> RESOLVED: Resolve https://github.com/w3c/activitypub/issues/239 by

<Loqi> removing mediaUpload and specified behavior from ActivityPub spec

<Loqi> proper, move to extension via So...

cwebber: we had an issue in which some proposals happened
... still, it was not quite resolved, and we're coming to the end point of activitypub. the mediaUpload endpoint was marked at risk, because it was the least-tested thing. I feel sad we're not getting it in, we will move it to the community group as an extension
... this is something we need and are going to get in anyways (MediaGoblin needs it and it's my project), but effectively mediaUpload endpoint is SocialCG now
... in that case, we've got 5 minutes left, I feel this was really productive, thank you melody for participating, and thank you all as usual

<ajordan> cwebber++ for chairing

<Loqi> cwebber has 28 karma

<ajordan> puckipedia++ for scribing :)

<Loqi> puckipedia has 13 karma

<cwebber> puckipedia++ for scribing indeed!

<Loqi> puckipedia has 14 karma

<cwebber> trackbot, end meeting
