Dr. Johnnatan Messias

Selected Works and Publications

For a full list of peer-reviewed conference and journal publications, kindly check my Google Scholar or DBLP Profile.

Airdrops: Giving Money Away Is Harder Than It Seems
Preprint.

Abstract: Airdrops are used by blockchain applications and platforms to attract an initial user base, and to grow the user base over time. In the case of many airdrops, tokens are distributed to select users as a "reward" for interacting with the underlying platform, with a long-term goal of creating a loyal community that will generate genuine economic activity well after the airdrop has been completed. Although airdrops are widely used by the blockchain industry, a proper understanding of the factors contributing to an airdrop's success is generally lacking. In this work, we outline the design space for airdrops, and specify a reasonable list of outcomes that an airdrop should ideally result in. We then analyze on-chain data from several larger-scale airdrops to empirically evaluate the success of previous airdrops, with respect to our desiderata. In our analysis, we demonstrate that airdrop farmers frequently dispose of the lion's share of airdrop proceeds via exchanges. Our analysis is followed by an overview of common pitfalls that typical airdrop designs lend themselves to, which are then used to suggest concrete guidelines for better airdrops.

Understanding Blockchain Governance: Analyzing Decentralized Voting to Amend DeFi Smart Contracts
Preprint.

Abstract: Smart contracts are contractual agreements between participants of a blockchain, who cannot implicitly trust one another. They are software programs that run on top of a blockchain, and we may need to change them from time to time (e.g., to fix bugs or address new use cases). Governance protocols define the means for amending or changing these smart contracts without any centralized authority. Instead, they distribute the decision-making power to every user of the smart contract: Users vote on accepting or rejecting every change. The focus of this work is to evaluate whether, how, and to what extent these protocols ensure decentralized governance, the fundamental tenet of blockchains, in practice. This evaluation is crucial as smart contracts continue to transform our key, traditional, centralized institutions, particularly banking and finance. In this work, we review and characterize decentralized governance in practice, using Compound, one of the widely used governance protocols, as a case study. We reveal a high concentration of voting power in Compound: 10 voters together hold 57.86% of the voting power. Although proposals to change or amend the protocol (or, essentially, the application they support) receive, on average, a substantial number of votes (i.e., 89.39%) in favor, they require fewer than three voters to obtain 50% or more votes. We show that voting on Compound governance proposals can be unfairly expensive for small token holders, and also discover voting coalitions that can further marginalize these users. We plan on publishing our scripts and data set on GitHub to support reproducible research.

The Writing is on the Wall: Analyzing the Boom of Inscriptions and its Impact on EVM-compatible Blockchains
In Proceedings of the 4th International Workshop on Cryptoasset Analytics (CAAW). 2025.

Abstract: This paper examines inscription-related transactions on Ethereum and major EVM-compatible rollups, assessing their impact on scalability during transaction surges. Our results show that, on certain days, inscriptions accounted for nearly 90% of transactions on Arbitrum and ZKsync Era, and 53% on Ethereum, with 99% of these inscriptions involving meme coin minting. Furthermore, we show that ZKsync and Arbitrum saw lower median gas fees during these surges. ZKsync Era, a ZK-rollup, showed a greater fee reduction than the optimistic rollups studied (Arbitrum, Base, and Optimism).

A Public Dataset For the ZKsync Rollup
In Proceedings of the 4th International Workshop on Cryptoasset Analytics (CAAW). 2025.

Abstract: Despite blockchain data being publicly available, practical challenges and high costs often hinder its effective use by researchers, thus limiting data-driven research and exploration in the blockchain space. This is especially true when it comes to Layer-2 (L2) ecosystems, and ZKsync, in particular. To address these issues, we have curated a dataset from 1 year of activity extracted from a ZKsync Era archive node and made it freely available to external parties. We provide details on this dataset and how it was created, showcase a few example analyses that can be performed with it, and discuss some future research directions.

Liquid Staking Tokens in Automated Market Makers
In Proceedings of the Mathematical Research for Blockchain Economy (MARBLE), Malaga, Spain, July 2024.

Abstract: This paper studies liquid staking tokens (LSTs) on automated market makers (AMMs), both theoretically and empirically. LSTs are tokenized representations of staked assets on proof-of-stake blockchains. First, we model LST-liquidity on AMMs theoretically, categorizing suitable AMM types for LST liquidity and deriving formulas for the necessary returns from trading fees to adequately compensate liquidity providers under the particular price trajectories of LSTs. For the latter, two relevant metrics are considered: (1) losses compared to holding the liquidity outside the AMM (loss-versus-holding, or "impermanent loss"), and (2) the relative profitability compared to fully staking the capital (loss-versus-staking) which is specifically tailored to the case of LST-liquidity. Next, we empirically measure these metrics for Ethereum LSTs across the most relevant AMM pools. We find that, while trading fees often compensate for impermanent loss, fully staking is more profitable for many pools, raising questions about the sustainability of the current LST liquidity allocation to AMMs.

Quantifying Arbitrage in Automated Market Makers: An Empirical Study of Ethereum ZK Rollups
In Proceedings of the Mathematical Research for Blockchain Economy (MARBLE), Malaga, Spain, July 2024.

Abstract: Arbitrage opportunities arise from the simultaneous buying and selling of the same asset in different markets to profit from price differences. This research systematically explores arbitrage possibilities between Automated Market Makers (AMMs) on Ethereum zk-rollups and Centralized Exchanges (CEXs). To start, we introduce a theoretical framework to assess arbitrage opportunities and develop a formula for the Maximal Arbitrage Value (MAV), considering both price discrepancies and the liquidity present in trading venues. Following this, we conduct an empirical assessment of the historical MAV between SyncSwap, an AMM on zkSync Era, and Binance, examining the speed at which price discrepancies are resolved considering both explicit and implicit market costs. Overall, the total MAV from July to September 2023 in the USDC-ETH SyncSwap pool amounts to $104.96K (0.24% of trading volume).

Layer-2 Arbitrage: An Empirical Analysis of Swap Dynamics and Price Disparities on Rollups
Preprint.

Abstract: This paper explores the dynamics of Decentralized Finance (DeFi) within the Layer-2 ecosystem, focusing on Automated Market Makers (AMMs) and arbitrage on Ethereum rollups. We observe significant shifts in trading activity from Ethereum to rollups, with swaps on rollups happening 2-3 times more often, though with lower trade volume. By examining the price differences between AMMs and centralized exchanges, we discover over 0.5 million unexploited arbitrage opportunities on rollups. Remarkably, we observe that these opportunities last, on average, 10 to 20 blocks, requiring adjustments to the LVR metrics to avoid double-counting arbitrage. Our results show that arbitrage in Arbitrum, Base, and Optimism pools ranges from 0.03% to 0.05% of trading volume, while in zkSync Era it oscillates around 0.25%, with the LVR metric overestimating arbitrage by a factor of five. Rollups not only offer lower gas fees but also provide faster block production, leading to significant differences compared to the trading and arbitrage dynamics of Ethereum.

Dissecting Bitcoin and Ethereum Transactions: On the Lack of Transaction Contention and Prioritization Transparency in Blockchains
In Proceedings of the Financial Cryptography and Data Security (FC 2023). Bol, Brač, Croatia. May, 2023.

Abstract: In permissionless blockchains, transaction issuers include a fee to incentivize miners to include their transaction. To accurately estimate this prioritization fee for a transaction, transaction issuers (or blockchain participants, more generally) rely on two fundamental notions of transparency, namely contention and prioritization transparency. Contention transparency implies that participants are aware of every pending transaction that will contend with a given transaction for inclusion. Prioritization transparency states that the participants are aware of the transaction or prioritization fees paid by every such contending transaction. Neither of these notions of transparency holds well today. Private relay networks, for instance, allow users to send transactions privately to miners. Besides, users can offer fees to miners via either direct transfers to miners' wallets or off-chain payments, neither of which is public. In this work, we characterize the lack of contention and prioritization transparency in Bitcoin and Ethereum resulting from such practices. We show that private relay networks are widely used and private transactions are quite prevalent. We show that the lack of transparency makes it easier for miners to collude and overcharge users who may use these private relay networks despite their offering little to no guarantees on transaction prioritization. The lack of these transparencies in blockchains has crucial implications for transaction issuers as well as the stability of blockchains. Finally, we make our data sets and scripts publicly available.

Selfish & Opaque Transaction Ordering in the Bitcoin Blockchain: The Case for Chain Neutrality
In Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC 2021). Virtual Event. November, 2021.

Abstract: Most public blockchain protocols, including the popular Bitcoin and Ethereum blockchains, do not formally specify the order in which miners should select transactions from the pool of pending (or uncommitted) transactions for inclusion in the blockchain. Over the years, informal conventions or "norms" for transaction ordering have, however, emerged via the use of shared software by miners, e.g., the GetBlockTemplate (GBT) mining protocol in Bitcoin Core. Today, a widely held view is that Bitcoin miners prioritize transactions based on their offered "transaction fee-per-byte." Bitcoin users are, consequently, encouraged to increase the fees to accelerate the commitment of their transactions, particularly during periods of congestion. In this paper, we audit the Bitcoin blockchain and present statistically significant evidence of mining pools deviating from the norms to accelerate the commitment of transactions for which they have (i) a selfish or vested interest, or (ii) received dark-fee payments via opaque (non-public) side-channels. As blockchains are increasingly being used as a record-keeping substrate for a variety of decentralized (financial technology) systems, our findings call for an urgent discussion on defining neutrality norms that miners must adhere to when ordering transactions in the chains. Finally, we make our data sets and scripts publicly available.

Modeling Coordinated vs. P2P Mining: An Analysis of Inefficiency and Inequality in Proof-of-Work Blockchains

Abstract: We study efficiency in a proof-of-work blockchain with non-zero latencies, focusing in particular on the (inequality in) individual miners' efficiencies. Prior work attributed differences in miners' efficiencies mostly to attacks, but we pursue a different question: Can inequality in miners' efficiencies be explained by delays, even when all miners are honest? Traditionally, such efficiency-related questions were tackled only at the level of the overall system, and in a peer-to-peer (P2P) setting where miners directly connect to one another. Despite it being common today for miners to pool compute capacities in a mining pool managed by a centralized coordinator, efficiency in such a coordinated setting has barely been studied. In this paper, we propose a simple model of a proof-of-work blockchain with latencies for both the P2P and the coordinated settings. We derive a closed-form expression for the efficiency in the coordinated setting with an arbitrary number of miners and arbitrary latencies, both for the overall system and for each individual miner. We leverage this result to show that inequalities arise from variability in the delays, but that if all miners are equidistant from the coordinator, they have equal efficiency irrespective of their compute capacities. We then prove that, under a natural consistency condition, the overall system efficiency in the P2P setting is higher than that in the coordinated setting. Finally, we perform a simulation-based study to demonstrate that even in the P2P setting delays between miners introduce inequalities, and that there is a more complex interplay between delays and compute capacities.

On Blockchain Commit Times: An analysis of how miners choose Bitcoin transactions
In Proceedings of the Second International KDD Workshop on Smart Data for Blockchain and Distributed Ledger (SDBD'20). August, 2020.

Abstract: Blockchains suffer from a well-known, non-trivial scalability problem: The low throughput (i.e., transactions committed per unit time) of blockchains when paired with the increasingly high volume of issued transactions leads to significant delays in transaction commit times. In a month-long investigation of Bitcoin, we reveal that congestion (i.e., when there exist more transactions than can be included in a block) is typical and that commit times exhibit a significant variance during periods of congestion. Although the fee-per-byte dequeuing policy is widely considered the "norm" for prioritizing transactions — and explaining how and when transactions are committed — we show that miners somehow delay a significant fraction of transactions. Such deviations undermine the utility of blockchains for ensuring a "fair" ordering that might be required for some applications.

(Mis)Information Dissemination in WhatsApp: Gathering, Analyzing and Countermeasures
In Proceedings of the 28th Web Conference (WWW'19). San Francisco, USA. May, 2019.

Abstract: WhatsApp has revolutionized the way people communicate and interact. It is not only cheaper than the traditional Short Message Service (SMS) communication but it also brings a new form of mobile communication: the group chats. Such groups are great forums for collective discussions on a variety of topics. In particular, in events of great social mobilization, such as strikes and electoral campaigns, WhatsApp group chats are very attractive as they facilitate information exchange among interested people. Yet, recent events have raised concerns about the spreading of misinformation in WhatsApp. In this work, we analyze information dissemination within WhatsApp, focusing on publicly accessible, politically oriented groups, collecting all shared messages during major social events in Brazil: a national truck drivers' strike and the Brazilian presidential campaign. We analyze the types of content shared within such groups as well as the network structures that emerge from user interactions within and across groups. We then deepen our analysis by identifying the presence of misinformation among the shared images using labels provided by journalists and by a proposed automatic procedure based on Google searches. We identify the most important sources of the fake images and analyze how they propagate across WhatsApp groups and from/to other Web platforms.

WhatsApp Monitor: A Fact-Checking System for WhatsApp
In Proceedings of the 13th International AAAI Conference on Web and Social Media (ICWSM’19). Munich, Germany. June, 2019.

Abstract: WhatsApp is the most popular communication application in many developing countries such as Brazil, India, and Mexico, where many people use it as an interface to the web. Due to its encrypted and peer-to-peer nature, it is hard for researchers to study which content people share through WhatsApp at scale. In this demo paper, we propose WhatsApp Monitor (http://www.whatsapp-monitor.dcc.ufmg.br), a web-based system that helps researchers and journalists explore the nature of content shared on WhatsApp public groups from two different contexts: Brazil and India. Our tool monitors multiple content categories such as images, videos, audio, and textual messages posted on a set of WhatsApp groups and displays the most shared content per day. Our tool has been used for monitoring content during the 2018 Brazilian general election and was one of the major sources for estimating the spread of misinformation and helping fact-checking efforts.

Search Bias Quantification: Investigating Political Bias in Social Media and Web Search
Information Retrieval Journal. Springer. Volume 22, Issue 1-2, April 2019.

Abstract: Users frequently use search systems on the Web as well as online social media to learn about ongoing events and public opinion on personalities. Prior studies have shown that the top-ranked results returned by these search engines can shape user opinion about the topic (e.g., event or person) being searched. In the case of polarizing topics like politics, where multiple competing perspectives exist, the political bias in the top search results can play a significant role in shaping public opinion towards (or away from) certain perspectives. Given the considerable impact that search bias can have on the user, we propose a generalizable search bias quantification framework that not only measures the political bias in the ranked list output by the search system but also decouples the bias introduced by the different sources: input data and ranking system. We apply our framework to study the political bias in searches related to the 2016 US Presidential primaries in Twitter social media search and find that both input data and ranking system matter in determining the final search output bias seen by the users. Finally, we use the framework to compare the relative bias for two popular search systems (Twitter social media search and Google web search) for queries related to politicians and political events. We end by discussing some potential solutions to signal the bias in the search results to make the users more aware of them.

On Microtargeting Socially Divisive Ads: A Case Study of Russia-Linked Ad Campaigns on Facebook
In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*'19), Atlanta, Georgia. January 2019.

Abstract: Targeted advertising is meant to improve the efficiency of matching advertisers to their customers. However, targeted advertising can also be abused by malicious advertisers to efficiently reach people susceptible to false stories, stoke grievances, and incite social conflict. Since targeted ads are not seen by non-targeted and non-vulnerable people, malicious ads are likely to go unreported and their effects undetected. This work examines a specific case of malicious advertising, exploring the extent to which political ads from the Russian Intelligence Research Agency (IRA) run prior to 2016 U.S. elections exploited Facebook's targeted advertising infrastructure to efficiently target ads on divisive or polarizing topics (e.g., immigration, race-based policing) at vulnerable sub-populations. In particular, we do the following: (a) We conduct U.S. census-representative surveys to characterize how users with different political ideologies report, approve, and perceive truth in the content of the IRA ads. Our surveys show that many ads are "divisive": they elicit very different reactions from people belonging to different socially salient groups. (b) We characterize how these divisive ads are targeted to sub-populations that feel particularly aggrieved by the status quo. Our findings support existing calls for greater transparency of content and targeting of political ads. (c) We particularly focus on how the Facebook ad API facilitates such targeting. We show how the enormous amount of personal data Facebook aggregates about users and makes available to advertisers enables such malicious targeting.

White, Man, and Highly Followed: Gender and Race Inequalities in Twitter
In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI'17). Leipzig, Germany. August 2017.

Abstract: Social media is considered a democratic space in which people connect and interact with each other regardless of their gender, race, or any other demographic factor. Despite numerous efforts that explore demographic factors in social media, it is still unclear whether social media perpetuates old inequalities from the offline world. In this paper, we attempt to identify the gender and race of Twitter users located in the U.S. using advanced image processing algorithms from Face++. Then, we investigate how different demographic groups (i.e. male/female, Asian/Black/White) connect with each other. We quantify the extent to which one group follows and interacts with another, and the extent to which these connections and interactions are reflected in inequalities on Twitter. Our analysis shows that users identified as White and male tend to attain higher positions in Twitter, in terms of the number of followers and the number of times they appear in users' lists. We hope our effort can stimulate the development of new theories of demographic information in the online space.

Demographics of News Sharing in the U.S. Twittersphere
In Proceedings of the 28th ACM Conference on Hypertext and Social Media (HT'17). Prague, Czech Republic. July 2017.

Abstract: The widespread adoption and dissemination of online news through social media systems have been revolutionizing many segments of our society and ultimately our daily lives. In these systems, users can play a central role as they share content with their friends. Despite that, little is known about news spreaders in social media. In this paper, we provide the first-of-its-kind in-depth characterization of news spreaders in social media. In particular, we investigate their demographics, what kind of content they share, and the audience they reach. Among our main findings, we show that males and white users tend to be more active in terms of sharing news, biasing the news audience to the interests of these demographic groups. Our results also quantify differences in interests of news sharing across demographics, which has implications for personalized news digests.

Linguistic Diversities of Demographic Groups in Twitter
In Proceedings of the 28th ACM Conference on Hypertext and Social Media (HT'17). Prague, Czech Republic. July 2017.

Abstract: The massive popularity of online social media provides a unique opportunity for researchers to study the linguistic characteristics and patterns of users' interactions. In this paper, we provide an in-depth characterization of language usage across demographic groups in Twitter. In particular, we extract the gender and race of Twitter users located in the U.S. using advanced image processing algorithms from Face++. Then, we investigate how demographic groups (i.e. male/female, Asian/Black/White) differ in terms of linguistic styles and also their interests. We extract linguistic features from 6 categories (affective attributes, cognitive attributes, lexical density and awareness, temporal references, social and personal concerns, and interpersonal focus), in order to identify the similarities and differences in particular sets of writing attributes. In addition, we extract the absolute ranking difference of top phrases between demographic groups. As a dimension of diversity, we also use the topics of interest that we retrieve from each user. Our analysis unveils clear differences in the writing styles (and the topics of interest) of different demographic groups, with variation seen across both gender and race lines. We hope our effort can stimulate the development of new studies related to demographic information in the online space.

Who Makes Trends? Understanding Demographic Biases in Crowdsourced Recommendations
In Proceedings of the International AAAI Conference on Web and Social Media (ICWSM’17). Montreal, Canada. May 2017.

Abstract: Users of social media sites like Facebook and Twitter rely on crowdsourced content recommendation systems (e.g., Trending Topics) to retrieve important and useful information. Contents selected for recommendation indirectly give the initial users who promoted (by liking or posting) the content an opportunity to propagate their messages to a wider audience. Hence, it is important to understand the demographics of people who make a content worthy of recommendation, and explore whether they are representative of the media site's overall population. In this work, using extensive data collected from Twitter, we make the first attempt to quantify and explore the demographic biases in the crowdsourced recommendations. Our analysis, focusing on the selection of trending topics, finds that a large fraction of trends are promoted by crowds whose demographics are significantly different from the overall Twitter population. More worryingly, we find that certain demographic groups are systematically under-represented among the promoters of the trending topics. To make the demographic biases in Twitter trends more transparent, we developed and deployed a Web-based service "Who-Makes-Trends" at http://twitter-app.mpi-sws.org/who-makes-trends.

Managing Longitudinal Exposure of Socially Shared Data on the Twitter Social Media
International Journal of Advances in Engineering Sciences and Applied Mathematics (Special Issue on Data Sciences), Springer, 2017.

Abstract: On most online social media sites today, user-generated data remains accessible to allowed viewers unless and until the data owner changes her privacy preferences. In this paper, we present a large-scale measurement study focused on understanding how users control the longitudinal exposure of their publicly shared data on social media sites. Our study, using data from Twitter, finds that a significant fraction of users withdraw a surprisingly large percentage of old publicly shared data: more than 28% of 6-year-old public posts (tweets) on Twitter are not accessible today. The inaccessible tweets are either selectively deleted by users or withdrawn by users when they delete or make their accounts private. We also found a significant problem with the current exposure control mechanisms: even when a user deletes her tweets or her account, the current mechanisms leave traces of residual activity, i.e., tweets from other users sent as replies to those deleted tweets or accounts still remain accessible. We show that using this residual information one can recover significant information about the deleted tweets or even characteristics of the deleted accounts. To the best of our knowledge, we are the first to study the information leakage resulting from residual activities of deleted tweets and accounts. Finally, we propose two exposure control mechanisms that eliminate information leakage via residual activities. One of our mechanisms optimizes for allowing meaningful social interactions with user posts, and the other aims to control longitudinal exposure via anonymization. We discuss the merits and drawbacks of our proposed mechanisms compared to existing mechanisms.

An Evaluation of Sentiment Analysis for Mobile Devices
In Springer Nature Social Network Analysis and Mining. Volume 7, Issue 1, 2017.

Abstract: Sentiment Analysis has become a key tool to extract knowledge from data containing opinions and sentiments, particularly, data from online social systems. With the increasing use of smartphones to access social media platforms, a new wave of applications that explore sentiment analysis in the mobile environment is beginning to emerge. However, there are various existing sentiment analysis methods and it is unclear which of them are deployable in the mobile environment. In this paper, we provide the first-of-its-kind study in which we compare the performance of 14 sentence-level sentiment analysis methods in the mobile environment. To do that, we adapted these methods to run on Android OS and then measured their performance in terms of memory, CPU, and battery consumption. Our findings unveil methods that require almost no adaptations and run relatively fast as well as methods that could not be deployed due to excessive use of memory. We hope our effort provides a guide to developers and researchers interested in exploring sentiment analysis as part of a mobile application and can help new applications to be executed without depending on a server-side API. We also share the Android API that implements all 14 sentiment analysis methods used in this paper.

Longitudinal Privacy Management in Social Media: The Need for Better Controls
IEEE Internet Computing (Special Issue on Usable Privacy & Security). Volume 21, Issue 3, May-June, 2017.

Abstract: This large-scale measurement study of Twitter focuses on understanding how users control the longitudinal exposure of their publicly shared social data (that is, their tweets) and the limitations of currently used control mechanisms. Our study finds that, while Twitter users widely employ longitudinal exposure control mechanisms, they face two fundamental problems. First, even when users delete their data or account, the current mechanisms leave significant traces of residual activity. Second, these mechanisms single out withdrawn tweets or accounts, attracting undesirable attention to them. To address both problems, an inactivity-based withdrawal scheme for improved longitudinal exposure control is explored.

Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media
In Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW'17). Portland, Oregon, USA, February 2017.

Abstract: To help their users to discover the most interesting contents at a particular time, social media sites like Facebook and Twitter deploy content recommendation systems (such as Trending Topics), which often rely on crowdsourced popularity signals to select the contents. Once the contents are selected for recommendation, they reach a large population, effectively giving the initial users of the contents an opportunity to propagate their messages to the wider public. Hence, it is extremely important to understand the demographics of people who make a content worthy of recommendation, and explore whether there are demographic biases in the recommended contents where the majority of the recommended contents were initially popular with crowds exhibiting skewed demographic distributions.
In this work, using extensive data collected from Twitter, we make the first attempt to quantify and explore the demographic biases in the crowdsourced recommendations (particularly, in the selection of trending topics). In our analysis, we find that very different topics are popular among different demographic groups, and in practice, there is a bias towards a particular demographic while selecting the trending topics. We further propose and evaluate different techniques to limit such demographic biases in trending topic selection.

From Migration Corridors to Clusters: The Value of Google+ Data for Migration Studies
In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’16). San Francisco, USA. August 2016.

Abstract: Recently, there have been considerable efforts to use online data to investigate international migration. These efforts show that Web data are valuable for estimating migration rates and are relatively easy to obtain. However, existing studies have only investigated flows of people along migration corridors, i.e. between pairs of countries. In our work, we use data about "places lived" from millions of Google+ users in order to study migration "clusters", i.e. groups of countries in which individuals have lived. For the first time, we consider information about more than two countries people have lived in. We argue that these data are very valuable because this type of information is not available in traditional demographic sources which record country-to-country migration flows independent of each other. We show that migration clusters of country triads cannot be identified using information about bilateral flows alone. To demonstrate the additional insights that can be gained by using data about migration clusters, we first develop a model that tries to predict the prevalence of a given triad using only data about its constituent pairs. We then inspect the groups of three countries which are more or less prominent, compared to what we would expect based on bilateral flows alone. Next, we identify a set of features such as a shared language or colonial ties that explain which triple of country pairs are more or less likely to be clustered when looking at country triples. Then we select and contrast a few cases of clusters that provide some qualitative information about what our data set shows. The type of data that we use is potentially available for a number of social media services. We hope that this first study about migration clusters will stimulate the use of Web data for the development of new theories of international migration that could not be tested appropriately before.

Towards Sentiment Analysis for Mobile Devices
In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’16). San Francisco, USA. August 2016.

Abstract: The increasing use of smartphones to access social media platforms opens a new wave of applications that explore sentiment analysis in the mobile environment. However, there are various existing sentiment analysis methods and it is unclear which of them are deployable in the mobile environment. This paper provides a first-of-its-kind study in which we compare the performance of 17 sentence-level sentiment analysis methods in the mobile environment. To do that, we adapted these sentence-level methods to run on Android OS and then measured their performance in terms of memory usage, CPU usage, and battery consumption. Our findings unveil sentence-level methods that require almost no adaptations and run relatively fast, as well as methods that could not be deployed due to excessive use of memory. We hope our effort provides a guide to developers and researchers interested in exploring sentiment analysis as part of a mobile application and can help new applications to be executed without depending on a server-side API.

Forgetting in Social Media: Understanding and Controlling Longitudinal Exposure of Socially Shared Data
In Proceedings of the 12th Symposium on Usable Privacy and Security (SOUPS'16). Denver, CO, USA. June 2016.

Abstract: On most online social media sites today, user-generated data remains accessible to allowed viewers unless and until the data owner changes her privacy preferences. In this paper, we present a large-scale measurement study focused on understanding how users control the longitudinal exposure of their publicly shared data on social media sites. Our study, using data from Twitter, finds that a significant fraction of users withdraw a surprisingly large percentage of old publicly shared data: more than 28% of six-year-old public posts (tweets) on Twitter are not accessible today. The inaccessible tweets are either selectively deleted by users or withdrawn by users when they delete or make their accounts private. We also found a significant problem with the current exposure control mechanisms: even when a user deletes her tweets or her account, the current mechanisms leave traces of residual activity, i.e., tweets from other users sent as replies to those deleted tweets or accounts still remain accessible. We show that using this residual information one can recover significant information about the deleted tweets or even characteristics of the deleted accounts. To the best of our knowledge, we are the first to study the information leakage resulting from residual activities of deleted tweets and accounts. Finally, we propose an exposure control mechanism that eliminates information leakage via residual activities, while still allowing meaningful social interactions with user posts. We discuss its merits and drawbacks compared to existing mechanisms.

You followed my bot! Transforming robots into influential users in Twitter
First Monday. Volume 18, Issue 7, July, 2013.

Abstract: Systems like Klout and Twitalyzer were developed as an attempt to measure the influence of users within social networks. Although the algorithms used by these systems are not publicly known, they have been widely used to rank users according to their influence, especially in the Twitter social network. As media companies might base their viral marketing campaigns on influence scores, users might attempt to boost their influence scores with simple mechanisms like following unknown users to be followed back or even interacting with those who reciprocate these actions. In this paper, we investigate whether widely used influence scores are vulnerable and easy to manipulate. Our approach consists of developing Twitter bot accounts able to interact with real users to verify strategies that can increase their influence scores according to different systems. Our results show that it is possible to become influential using very simple strategies, suggesting that these systems should review their influence score algorithms to avoid counting automated activity.

Brazilian Venues

Indo além da primeira camada: Modelagem e Avaliação de Desempenho de ZK-Rollups na plataforma Ethereum
In Proceedings of the 43rd Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (SBRC 2025). Natal, Brazil. May, 2025.

Abstract: Although the Ethereum platform's transition to Proof-of-Stake and the emergence of sidechains offer partial solutions to scalability problems, these approaches involve trade-offs between security and implementation complexity. To mitigate these challenges, ZK-Rollups have emerged as Layer-2 scalability solutions, combining off-chain computation with on-chain verification while ensuring security and decentralization on the Ethereum platform. This paper proposes an approach based on Stochastic Petri Nets to evaluate the feasibility of ZK-Rollups, considering the main factors that affect essential performance metrics such as throughput and latency. We also analyze the cost-benefit relationship, including the average cost per transaction and how it is affected by the performance metrics. The results show that greater adoption of Layer-2 transactions can increase system throughput by up to 20%, from 85 tps in an environment without Layer-2 to 105 tps when 90% of transactions follow this path. On the other hand, latency can increase by more than 100% when larger batches are used on Layer-2.

A System for Monitoring Public Political Groups in WhatsApp
In Proceedings of the 24th Brazilian Symposium on Multimedia and the Web (Webmedia'18). Salvador, Brazil. October, 2018.

Abstract: In Brazil, 48% of the population use WhatsApp to share and discuss news. Currently, there are serious concerns that this platform can become a fertile ground for groups interested in disseminating misinformation, especially as part of articulated political campaigns. In particular, WhatsApp provides an important space for users to engage in public conversations that are worth attention: the public groups. These groups are suitable for political activism and social movement organization. Additionally, it is reasonable to assume that a malicious misinformation campaign might attempt to maximize the audience of a fake story by sharing it in existing public groups. In this paper, we present a system for gathering, analyzing, and visualizing public groups in WhatsApp. In addition to describing our methodology, we also provide a brief characterization of the content shared in 127 Brazilian groups. We hope our system can help journalists and researchers to understand the repercussions of events related to the Brazilian elections within these groups.

Brazil Around the World: Characterizing and Detecting Brazilian Emigrants Using Google+
In Proceedings of 21st Brazilian Symposium on Multimedia and the Web (WebMedia'15). Manaus, Brazil. October, 2015.

Abstract: Currently available data about people who left their home country to live in a foreign country do not adequately capture the patterns of contemporary global migration flows. A new trend in migration studies is to study data from the Internet, either from social networks or from other data on the Web. In this study, we collected user data from the social network Google+ to investigate which features of Brazilian users are relevant to classify them as possible emigrants. Our study uses a machine learning technique, SVM. We selected some features to compose our dataset. Our results show that the network features were the ones with the greatest capacity for discrimination. The most relevant for predicting Brazilian emigrant users are, in order: reciprocity, PageRank, in-degree, clustering coefficient, and ratio of incoming foreigners.

Algoritmos de Aprendizado de Máquina para Predição de Resultados das Lutas de MMA
In Proceedings of the 30th Brazilian Symposium on Databases (SBBD'15). Petrópolis, Brazil. October, 2015.

Abstract: This paper proposes using machine learning algorithms to predict the outcome of an MMA fight based on the characteristics of the two fighters and their recent opponents. Our experimental evaluation presents an approach for creating a dataset applicable to individual sports, and one of the evaluated algorithms achieves 67% successful predictions.

Bazinga! Caracterizando e Detectando Sarcasmo e Ironia no Twitter
In Proceedings of the Brazilian Workshop on Social Network Analysis and Mining (BraSNAM). Recife, Brazil. July, 2015.

Abstract: Sarcasm and irony are widely used forms of speech, inside and outside the Web, with the power to transform the polarity or sense of a sentence. The ability to characterize and detect sarcastic and ironic messages in data collected from the Web could improve many decision-making systems based on Natural Language Processing (NLP), such as sentiment analysis, text summarization, and review ranking systems. In this work, we propose some approaches to the task of characterizing and detecting sarcasm and irony in messages posted on the Twitter online social network. Using an automatically collected dataset with the hashtags “#sarcasm” and “#irony”, and by exploiting a large set of characterization and classification techniques, our results show satisfactory rates of accuracy and Macro-F1.

Bots Sociais: Como robôs podem se tornar pessoas influentes no Twitter?
In the Revista Eletrônica de Sistemas de Informação (RESI), v. 14, n. 2, mai-ago 2015, artigo 4.

Sigam-me os bons! Transformando robôs em pessoas influentes no Twitter
In Proceedings of the Brazilian Workshop on Social Network Analysis and Mining (BraSNAM). Curitiba, Brazil. July, 2012.

Abstract: Systems that classify influential users in social networks have been used with great frequency, being referenced in scientific papers and the media as the ideal standard for evaluating influence in the Twitter social network. We consider this measure complex and subjective, and therefore suspect that these systems are vulnerable and easy to manipulate. Based on this, we performed experiments and analyses on two influence ranking systems: Klout and Twitalyzer. We created simple robots capable of interacting through Twitter accounts and measured their influence. Our results show that it is possible to be influential through simple strategies. This suggests that the systems do not have an ideal metric to rank influence.

Thesis and technical reports

On Fairness Concerns in the Blockchain Ecosystem
PhD Thesis. Max Planck Institute for Software Systems (MPI-SWS) & Saarland University (UdS). 2023.

Abstract: Blockchains revolutionized centralized sectors like banking and finance by promoting decentralization and transparency. In a blockchain, information is transmitted through transactions issued by participants or applications. Miners crucially select, order, and validate pending transactions for block inclusion, prioritizing those with higher incentives or fees. The order in which transactions are included can impact the blockchain's final state.

Moreover, applications running on top of a blockchain often rely on governance protocols to decentralize the decision-making power to make changes to their core functionality. These changes can affect how participants interact with these applications. Since one token equals one vote, participants holding multiple tokens have a higher voting power to support or reject the proposed changes. The extent to which this voting power is distributed is questionable and if highly concentrated among a few holders can lead to governance attacks.

In this thesis, we audit the Bitcoin and Ethereum blockchains to investigate the norms followed by miners in determining the transaction prioritization. We also audit decentralized governance protocols such as Compound to evaluate whether the voting power is fairly distributed among the participants. Our findings have significant implications for future developments of blockchains and decentralized applications.

Characterizing Interconnections And Linguistic Patterns In Twitter
Master Thesis. Computer Science Department. Universidade Federal de Minas Gerais (UFMG). 2017.

Abstract: Social media is considered a democratic space in which people connect and interact with each other regardless of their gender, race, or any other demographic aspect. Despite numerous efforts that explore demographic aspects in social media, it is still unclear whether social media perpetuates old inequalities from the offline world. In this dissertation, we attempt to identify gender and race of Twitter users located in the United States using advanced image processing algorithms from Face++. We investigate how different demographic groups (i.e. male/female, asian/black/white) connect with each other and differentiate them regarding linguistic styles and also their interests. We quantify to what extent one group follows and interacts with each other and the extent to which these connections and interactions reflect in inequalities in Twitter. We also extract linguistic features from six categories (affective attributes, cognitive attributes, lexical density and awareness, temporal references, social and personal concerns, and interpersonal focus) in order to identify the similarities and the differences in the messages they share in Twitter. Furthermore, we extract the absolute ranking difference of top phrases between demographic groups. As a dimension of diversity, we also use the topics of interest that we retrieve from each user. Our analysis shows that users identified as white and male tend to attain higher positions, in terms of the number of followers and number of times in another user's lists, in Twitter. There are clear differences in the way of writing across different demographic groups in both gender and race domains as well as in the topic of interest. We hope our effort can stimulate the development of new theories of demographic information in the online space. Finally, we developed a Web-based system that leverages the demographic aspects of users to provide transparency to the Twitter trending topics system.

Framework para Sistemas de Navegação de Veículos Aéreos não Tripulados
Bachelor Thesis. Computer Science Department. Universidade Federal de Ouro Preto (UFOP). 2014.

Abstract: Autonomous unmanned flights undoubtedly enable new opportunities for scientific development. Drones can be used in military services, for example in combat, as well as for rescue missions, aerial surveys, and the supervision and inspection of a territory, attracting significant attention from media outlets such as television stations, radio, newspapers, and the internet. The goal of this project is to determine whether autonomous flights are viable on the AR.Drone 2.0 and to understand its operation. This requires implementing a control program for autonomous flights. The framework acquires data during the flight from sensors attached to an Arduino. Communication between the Arduino and the drone is needed to include new sensors, and the AR.Drone is controlled through the Node.js framework. Each button on the remote control maps to a specific command, so the user can create their own missions or run missions previously implemented by the developer. All tests were run on the AR.Drone 2.0, using the Node.js framework, sensors, and a remote control. The experiments and studies presented show that the proposed objective was achieved, making autonomous flights on the drone possible. As a result, we designed a framework in which the user can create autonomous flight missions for the drone to run. The user sends the commands to the drone with a remote control, which sends data to a sensor connected to the Arduino; the Arduino processes the data, which is then read and interpreted by the drone.

Press Coverage

My work has been published at top venues in computer science, including IMC, WWW, ICWSM, CSCW, and more. Some of my scientific efforts have also been covered by the news media and specialized blogs, including The New York Times, The Huffington Post, MIT Tech Review, BBC Brasil, and Folha de São Paulo.

Projeto Eleições sem Fake

Gender and Race Inequalities in Twitter - paper on Web Intelligence'17

MIT Technology Review: Twitter's Glass Ceiling Revealed for Women and Minority Races.
Nexo Jornal: Como o Twitter reproduz desigualdades de gênero e raça do mundo offline.

Making a bot influential in Twitter - paper on First Monday'13

The New York Times: I Flirt and Tweet. Follow Me at #Socialbot.
Huffington Post Tech: Twitter Bots Have No Trouble Fooling You, Getting More Influence Than Oprah
Huffington Post Business: Twitter Bots Have No Trouble Fooling You, Getting More Influence Than Oprah
San Francisco Chronicle: How do you know I'm not a bot?
GizmodoBR: Como um bot brasileiro se tornou uma pessoa influente no Twitter
AOL Tech News: Twitter bots gain real influence, Analysts Find
G1 – Instante posterior. Feitos um para o outro
Folha de São Paulo. Robô Social
Daily Dot: The new generation of Twitter bots are disturbingly human
TheWeek: Robot 'pals' are invading social media — and it's time to unfriend them

Awards

[2025] - I have been invited, on behalf of the Alexander von Humboldt Foundation, to represent Germany at the Brazilian-German BRAGFOST Symposium.
[2024] - I have been invited, on behalf of the President of the Alexander von Humboldt Foundation, to represent Germany at the Indo-German INDOGFOE Symposium. The event brought together outstanding Indian and German scientists.
[2024] - The paper Liquid Staking Tokens in Automated Market Makers received the best paper award from MARBLE 2024.
[2024] - Successfully completed Ph.D. with Magna Cum Laude distinction, demonstrating exceptional academic excellence and rigorous research capabilities.
[2023] - My seminar on Blockchains and Decentralized Finance (DeFi) has been nominated for the “Busy Beaver” award – an award for outstanding and excellent lectures at the Saarland University.
[2022] - CISPA Summer School on Trustworthy Artificial Intelligence 2022: Part of the approx. 100 students selected worldwide.
[2019] - The Swiss Blockchain Winter School 2019: Received a stipend award provided to selected students to participate in the event.
[2019] - Our Kunumi’s medical auditor project was awarded as Brazil’s most innovative health software in 2019 according to IT Forum 365, promoted by PwC and ITMidia.
[2015] - Research Grant during my M.Sc. by the Brazilian National Council for Scientific and Technological Development (CNPq) and by the Coordination for the Improvement of Higher Education Personnel (CAPES).
[2014] - Motion of applause for developing the SmartHome project, during the exchange program Science Without Borders in Budapest - Hungary - Câmara de Mariana/MG Brazil - (November/2014)
[2013] - Granted a scholarship in 2013 for academic excellence to study in a European university for 14 months.
[2013] - Best paper nominee: CTIC '13, BraSNAM’12
[2013] - Ranked 3rd in the Brazilian Young Research Scientist Competition (XXXII Concurso de Trabalhos de Iniciação Científica - CTIC 2013), XXXIII Congresso da Sociedade Brasileira de Computação (CSBC 2013)
[2012] - Honorable Mention Article Sigam-me os Bons! Transformando robôs em pessoas influentes no Twitter, Brazilian Workshop on Social Network Analysis and Mining (BraSNAM’12)
[2010] - Research Grant during my B.Sc. by the Brazilian National Council for Scientific and Technological Development (CNPq).

Talks and Interviews

[2025] - Fairness in Token Delegation: Mitigating Voting Power Concentration in Decentralized Autonomous Organizations. 3rd Edition of the TUM Blockchain&Cybersecurity Salon, Munich, Germany.
[2025] - The Writing is on the Wall: Analyzing the Boom of Inscriptions and its Impact on EVM-compatible Blockchains. CAAW, Miyakojima, Japan.
[2025] - A Public Dataset for the ZKsync Rollup. CAAW, Miyakojima, Japan.
[2025] - Fairness Concerns in the Blockchain Ecosystem at the Max Planck Institute for Security and Privacy (MPI-SP), Bochum, Germany.
[2025] - Web3 Is Broken? Airdrops, UX & Ethics Explained by Research Scientist. Decentralized Voices.
[2025] - The Writing is on the Wall: Analyzing the Boom of Inscriptions and its Impact on EVM Blockchains. Ethereum Zürich. Zürich, Switzerland.
[2024] - Fairness Concerns in the Blockchain Ecosystem at the INDOGFOE, Indo-German Frontiers of Engineering Symposium by Alexander von Humboldt Foundation, Mumbai, India.
[2024] - Blockchain Research: Where society becomes decentralized at the Max Planck Institute for Software Systems (MPI-SWS), Kaiserslautern, Germany.
[2024] - Airdrops: Giving Money Away Is Harder Than It Seems. EthCC. Brussels, Belgium.
[2024] - Understanding Blockchain Governance. New York City, USA.
[2023] - Dissecting Bitcoin and Ethereum Transactions: On the Lack of Transaction Contention and Prioritization Transparency in Blockchains. Financial Cryptography and Data Security (FC 2023). Bol, Brač, Croatia.
[2021] - Selfish & Opaque Transaction Ordering in the Bitcoin Blockchain: The Case for Chain Neutrality. ACM SIGCOMM Internet Measurement Conference (IMC 2021). Virtual Event.
[2020] - On Blockchain Commit Times: An analysis of how miners choose Bitcoin transactions. Second International KDD Workshop on Smart Data for Blockchain and Distributed Ledger (SDBD'20).
[2020] - Countering Misinformation on Social Media Platforms. ThoughtWorks. Belo Horizonte, BR.
[2020] - Countering Misinformation on Social Media Platforms. SMART Data Sprint 2020. Lisbon, PT.
[2019] - (Mis)Information Dissemination in WhatsApp: Gathering, Analyzing and Countermeasures. 5th International Conference on Computational Social Science (IC2S2’19). Amsterdam, NL.
[2016] - From Migration Corridors to Clusters: The Value of Google+ Data for Migration Studies. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’16). San Francisco, US.
[2016] - Towards Sentiment Analysis for Mobile Devices. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’16). San Francisco, US.
[2015] - Bazinga! Caracterizando e Detectando Sarcasmo e Ironia no Twitter. Brazilian Workshop on Social Network Analysis and Mining (BraSNAM’15). Recife, BR.
[2015] - Brazil Around the World: Characterizing and Detecting Brazilian Emigrants Using Google+. Brazilian Symposium on Multimedia and the Web (WebMedia’15). Manaus, BR.
[2013] - Bots Sociais: Como robôs podem se tornar pessoas influentes no Twitter? XXXII Concurso de Trabalhos de Iniciação Científica (CTIC’13). Maceió, BR.
[2012] - Sigam-me os bons! Transformando robôs em pessoas influentes no Twitter. Brazilian Workshop on Social Network Analysis and Mining (BraSNAM’12). Curitiba, BR.
[2011] - UGUIDE: Rede Social Móvel Aplicada a Educação. XIX Seminário de Iniciação Científica da UFOP. Ouro Preto, BR.
[2011] - Computação Móvel: Tendências e Android. 6a Semana da Informática - IFSULDEMINAS. Muzambinho, BR.
[2010] - BlueGuide - Uma Plataforma de Suporte ao Turista em Ouro Preto. I Seminário de Pesquisa do PPGCC & UFOP e I Fórum de Alunos e Ex-Alunos do DECOM. Ouro Preto, BR.
