
J. Patel, Worcester Polytechnic Institute, email: jupatel@wpi.com
P. Sonar, Worcester Polytechnic Institute, email: prajankya@wpi.com
C. Pinciroli, Worcester Polytechnic Institute, email: cpinciroli@wpi.com

On Multi-Human Multi-Robot Remote Interaction

A Study of Transparency, Inter-Human Communication, and Information Loss in Remote Interaction

Jayam Patel Prajankya Sonar Carlo Pinciroli


Abstract

In this paper, we investigate how to design an effective interface for remote multi-human multi-robot interaction. While significant research exists on interfaces for individual human operators, little research exists for the multi-human case. Yet solving this problem is critical to making complex, large-scale missions achievable when direct human involvement is impossible or undesirable and robot swarms act as semi-autonomous agents. This paper’s contribution is twofold. The first contribution is an exploration of the design space of computer-based interfaces for multi-human multi-robot operations. In particular, we focus on information transparency and on the factors that affect inter-human communication in ideal conditions, i.e., without communication issues. Our second contribution concerns the same problem, but considering increasing degrees of information loss, defined as intermittent reception of data with noticeable gaps between individual receipts. We derive a set of design recommendations based on two user studies involving 48 participants.

Keywords:

Information Transparency · Inter-Human Communication · Information Loss · Remote Interaction · Multi-Human Multi-Robot Interaction

1 Introduction

Robot swarms promise solutions for missions in which direct human involvement is either impossible or undesirable, such as search-and-rescue, firefighting, planetary exploration, and ocean restoration murphy2012decade . When robot swarms are deployed to perform complex missions, autonomy is only part of the picture. Along with autonomy, it is equally important for human operators to monitor and affect the behavior of the swarm. This creates the issue of designing effective solutions for remote interaction between humans and robot swarms.

While a significant body of work exists in remote interaction involving single humans and one or more robots, the scenario in which multiple humans interact with a robot swarm has received little attention. In this paper, we argue that it will be common for multiple humans to cooperate in the supervision of robot swarms. First, the amount of information generated by robot swarms is likely to exceed the span of apprehension of any individual operator miller1956magical , even when considering highly skilled ones such as video gamers. Cooperation among human operators would make monitoring more efficient. Second, the involvement of multiple humans allows for improved flexibility in robot control and task assignment, an important advantage in complex operations.

However, the involvement of multiple humans comes with old and new challenges. Among the old, we highlight the need for information transparency, which is the ability of the interface-swarm system to convey useful data for the operators to understand and modify the status of the swarm wohleber_effects_2017 ; chen_situation_2018 ; roundtree_transparency:_2019 ; bhaskara_agent_2020 ; chakraborti_explicability_nodate ; tulli_eects_nodate . Multiple operators also create the new challenge of conveying intentions and actions to other operators, i.e., effective inter-human communication, for better cooperation and conflict mitigation tomasello2010origins . Inter-human communication can be either direct or indirect. Direct communication includes verbal and non-verbal communication (e.g., gestures) holdcroft1976forms . Indirect communication is mediated through the remote interface (e.g., a graphical user interface on a laptop or tablet). Effective indirect communication requires inter-operator transparency, which pushes for interface designs that make it simple for operators far away from each other to exchange information on their intentions and plans breazeal2005effects ; lyons2013being ; chen_situation_2014 ; wohleber_effects_2017 ; roundtree_transparency:_2019 ; bhaskara_agent_2020 .

In this paper, we explore the design space of remote interfaces for multi-human multi-robot interaction. We study the role of direct and indirect communication among operators, and investigate how to achieve high levels of information and inter-operator transparency through several variants of our interface. The result of this work is a set of recommendations on which design elements contribute to making a remote interface effective. This part of our study builds upon previous work Patel2021 in which we investigated transparency and inter-human communication on the performance of human operators in proximal interaction. Proximal interaction occurs when humans and robots share the same environment.

Remote interaction allows us to study another important aspect—the role of information loss. In this paper, we consider information loss as a decrease in the frequency of the visual information presented to the operators. We measure information loss as the time interval, measured in seconds, between the delivery of consecutive video frames (the inverse of frames per second). Packet loss, bandwidth limitations, and geographical distance between the locations of the operators and the robots act as causal factors for information loss. Information loss leads to degraded operator performance, lack of awareness and trust, and increase in cognitive workload ellis2004generalizeability .
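As a minimal sketch of this metric, the snippet below computes the interval between consecutive frames from a list of arrival timestamps, together with the equivalent frame rate. The function names and the example timestamps are hypothetical; they only illustrate the definition above and are not taken from the implementation of the interface.

```python
def frame_intervals(timestamps):
    """Return the gaps, in seconds, between consecutive frame arrivals."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def mean_information_loss(timestamps):
    """Average inter-frame interval in seconds (larger means more loss)."""
    gaps = frame_intervals(timestamps)
    if not gaps:
        raise ValueError("at least two timestamps are required")
    return sum(gaps) / len(gaps)


def equivalent_fps(interval_s):
    """Frame rate corresponding to a given inter-frame interval."""
    return 1.0 / interval_s


# Hypothetical arrival times (seconds) of four video frames.
arrivals = [0.0, 0.5, 1.0, 2.0]
gaps = frame_intervals(arrivals)
average_gap = mean_information_loss(arrivals)
```

Under this definition, an interval of 0.5 s corresponds to 2 frames per second, and longer intervals correspond to higher information loss.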

The last factor we consider in our study is that, in the presence of non-ideal communication, the operators are also likely to experience heterogeneous levels of information loss, causing a disparity in workload and situational awareness across operators.

The main contributions of this paper can be summarized as follows:

• We provide an extensive investigation of the design space of remote interfaces for multi-human multi-robot interaction. We consider factors such as direct and indirect communication, information and inter-operator transparency, and homogeneous and heterogeneous information loss.
• We compile a set of design recommendations validated through a user study that included 48 participants. We implemented a highly configurable remote interface that incorporates these recommendations and enables future studies of this kind.

This paper is organized as follows. We discuss related literature on remote human-robot interaction in Sec. 2. In Sec. 3, we discuss the design of our configurable remote interface. We report the results of our user study in ideal conditions in Sec. 4. We then introduce different types of information loss and report the results of a dedicated user study in Sec. 5. We summarize our contributions and outline directions for future work in Sec. 6.

2 Related Work

Remote robot control and manipulation has been a field of interest since Goertz and Thompson laid the foundation of modern tele-operation goertz1954electronically . The field has mostly focused on manipulators hokayem_bilateral_2006 ; lichiardopol_survey_2007 ; varkonyi_survey_2014 ; jung_robotic_2018 ; li_operator_2018 rather than on mobile robots. This body of research has contributed advancements in tele-presence ferreira_immersive_nodate ; klow_privacy_2017 ; dimitoglou_telepresence_2019 ; salichs_privacy_2019 , tele-robotics nak_young_chong_remote_2000 ; rakita_remote_2019 , tele-operation jingtai_liu_competitive_2005 ; ma_teleoperation_2010 ; hutchison_evaluation_2010 ; mansour_dynamic_2012 ; hong_visual_2013 , and tele-surgery patel_long_2019 ; shahzad_telesurgery_2019 ; dardona_remote_2019 . This research has focused on identifying suitable interfaces and improving their usability lager_remote_2019 ; lunghi_multimodal_2019 ; roldan_bringing_2019 ; regenbrecht_intuitive_nodate ; music_humanrobot_2019 ; esfahlani_mixed_2019 ; welburn_mixed_2019 ; jang_omnipotent_2019 , as well as proposing novel control architectures for these interfaces cheung_semi-autonomous_2011 ; do_multiple_2011 ; lee_development_2012 ; lee_semiautonomous_2013 . Chen et al. chen_human_2007 categorize existing research according to the factors that affect remote control of robots. These factors are field of view, system orientation, camera viewpoints, depth perception, degraded video quality, time delay, and camera motion. Building upon this work, Feth et al. dillmann_shared-control_2009 and Kim et al. kim_implementation_2012 ; kim_implementation_2013 present a shared control framework to allow multiple operators to interact with manipulators. Lee et al. dong_gun_lee_human-centered_2013 extend these shared control frameworks to study the impact of information delay on the performance of human operators. In their work, the authors incorporate a passivity-based controller to counteract the negative effects of information delay on the operators’ performance. These works are limited to interface design for remote interaction with industrial manipulators, and their findings may not be applicable to remote interfaces for manipulating numerous mobile robots. To the best of our knowledge, our study is the first to investigate the impact of transparency and inter-human communication on multi-human multi-robot interaction.

Loss of information has been recognized as a key factor in the performance and engagement of human operators chen_human_2007 ; mackenzie1993lag ; ellis2004generalizeability ; lane2002effects ; sheridan1963remote ; rastogi1997design ; darken2014spatial ; watson1998effects ; massimino1994teleoperator ; chen2008human . Research suggests that the effect of information loss and the ability to handle it may vary according to the task and the interface used to interact with the system. Three methods have been proposed to mitigate the effects of information loss on the performance of human operators: passivity-based control lewis_two_2011 ; varkonyi_survey_2014 ; cheung_semi-autonomous_2011 ; jung_robotic_2018 ; hokayem_bilateral_2006 , predictive displays keskinpala2004objective ; baker2004improved ; calhoun200611 ; collett2006developer ; daily2003world ; nielsen2006comparing ; sheridan2002humans ; kheddar2014virtual ; ricks2004ecological , and higher granularity of control patel2019 ; ayanian_controlling_2014 ; kolling_human_2013 . However, these studies are limited to the scenario in which a single operator interacts with one or more robots. Our study furthers this line of research by providing an extensive investigation of the factors that affect the design of remote interfaces for multi-human multi-robot interaction in the presence of information loss.

3 System Design

In this section, we present the main features of our remote interface and the behavior of the robots. At its essence, our interface is a web-based client-server architecture. The server runs ARGoS Pinciroli:SI2012 , a fast multi-robot simulator, on a node offered by Amazon Web Services (https://aws.amazon.com/). The server is implemented as a visualization plugin that accepts multiple connections from the clients. The client side is a web application implemented with Node.js (https://nodejs.org/) and WebGL (https://get.webgl.org/), which offers features similar to those of the original graphical visualization of ARGoS. A diagram of the client-server architecture is reported in Fig. 1 and a screenshot of the web interface is shown in Fig. 2. The source code of the system is available online as open source software (https://github.com/NESTLab/argos3-webviz).

Figure 1: System overview.

The process starts when a user performs a command on the client. The web interface allows the user to operate at multiple levels of granularity. In our previous work patel_mixed-granularity_2019 , we found that mixed granularity of control offers superior usability in complex missions that require both navigation and environment modification. Similarly to patel_mixed-granularity_2019 , in this paper we focus on a collective transport scenario due to the compositional nature that this kind of task presents — collective transport combines navigation, task allocation, and object manipulation. Our interface is therefore designed for this scenario and it mirrors many of the features we presented in patel_mixed-granularity_2019 . It is important to highlight, however, that the remote interface presented here is a completely new artifact based on a different technology: in fact, the work in patel_mixed-granularity_2019 studied proximal interactions with a touch-based interface.

3.1 Collective Transport

We employ a collective transport behavior based on the finite state machine shown in Fig. 3. The behavior is identical to the one discussed in our previous work patel_mixed-granularity_2019 . The states in the finite state machine are as follows:

Reach Object. On receiving the desired goal position for the object, the robots in the transport team navigate and organize themselves around the object in a circular manner. These positions are generated based on the number of robots in the team and their distance from the object. The state comes to an end once all the robots reach their designated positions.

Approach Object. After organizing themselves, the robots move towards the centroid of the object. The state comes to an end once all the robots are touching the object.

Push Object. Once the robots are in contact with the object, the robots rotate in place to face the direction of the goal. The robots start moving at equal speed towards the goal, while maintaining a fixed distance from the centroid of the object. This strategy prevents the robots in front and on the sides from breaking formation. If a robot breaks the formation, the robots switch back to Reach Object, wait for its completion, and subsequently resume their transport operation. The state comes to an end once the object reaches the goal position.

Rotate Object. The robots rearrange themselves around the object and move in a circular path in the outward direction, thereby rotating the object in place. If any robot breaks the formation, the robots rearrange themselves and resume rotating the object. The state comes to an end once the robots achieve the desired rotation.
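The state descriptions above can be summarized as a small transition function. The following is a minimal sketch and not the controller used in our system: the flag names are hypothetical, and both the transition from Push Object to Rotate Object and the terminal DONE state are assumptions, since the text above specifies only when each state ends.

```python
from enum import Enum, auto


class State(Enum):
    REACH_OBJECT = auto()
    APPROACH_OBJECT = auto()
    PUSH_OBJECT = auto()
    ROTATE_OBJECT = auto()
    DONE = auto()  # hypothetical terminal state, not named in the text


def next_state(state, *, all_in_position=False, all_touching=False,
               formation_broken=False, at_goal=False, rotation_done=False):
    """Return the state that follows `state` given the observed conditions."""
    if state is State.REACH_OBJECT:
        # Ends once every robot has reached its slot around the object.
        return State.APPROACH_OBJECT if all_in_position else state
    if state is State.APPROACH_OBJECT:
        # Ends once every robot touches the object.
        return State.PUSH_OBJECT if all_touching else state
    if state is State.PUSH_OBJECT:
        if formation_broken:
            # A broken formation sends the team back to Reach Object.
            return State.REACH_OBJECT
        # Assumed ordering: rotation follows pushing.
        return State.ROTATE_OBJECT if at_goal else state
    if state is State.ROTATE_OBJECT:
        # A broken formation is repaired inside this state, so it stays here.
        return State.DONE if rotation_done else state
    return state
```

Keeping the transitions in a single pure function makes each rule easy to check in isolation, for example that a broken formation during pushing always leads back to Reach Object.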

3.2 User Interface

Object Manipulation. Object manipulation is triggered when an operator selects an object with a left click. The interface overlays the selected object with a transparent bounding box, and the operator assigns the goal position with a right click. The operator can also define the goal position for multiple objects. In this case, the robots autonomously distribute themselves across the objects and transport them using the collective transport behavior. If two or more operators manipulate the same object, the interface keeps the position specified by the last operator. Fig. 4(a) shows a selected object overlaid with a bounding box. Fig. 4(b) illustrates how the goal position is visualized. The desired position and orientation of the object are conveyed by the interface as shown in Fig. 4(c) and 4(d).

Robot Manipulation. Robot manipulation starts with an operator selecting a robot with a left click. The goal position is assigned using a right click. The interface overlays the selected robot with a transparent bounding box to convey the current selection. The operator can define the goal position for multiple robots at once. If the robot is performing the collective transport behavior during this request, other robots in the collective transport team pause their operation until the selected robot reaches the desired position. If the robot is part of an operator-defined team, the selected robot navigates to the newly specified position and the other robots continue their respective operations. When two or more operators want to manipulate the same robot, the interface processes the position specified by the last operator. Fig. 5(a) shows a selected robot overlaid with a bounding box to visualize the current selection. Fig. 5(b) shows the goal position determined by the operator and visualized as a colored representation of the selected robot. The color of the goal position matches the color of the fiducial markers to differentiate between the goal positions of different robots. Fig. 5(c) shows the selected robot navigating to the specified goal position.

Robot Team Selection and Manipulation. In addition to manipulating a single robot, the operator can select a team of robots by pressing the Control key and clicking the left mouse button. The goal position is still assigned with a right click. The interface overlays a transparent bounding box over all the selected robots to identify the current selection. If two or more operators have the same robot in their team, then the common robot navigates to the position specified by the last operator without affecting other robots in other teams. Fig. 6(a) shows a screenshot in which the selected robots are overlaid with a bounding box. Fig. 6(b) shows the goal position visualized as colored virtual objects, one for each of the selected robots. The color of the virtual objects matches the color of the fiducial markers on the body of the robots. Fig. 6(c) shows the robots navigating to the goal position.
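The rule that the last operator's command prevails, shared by object, robot, and team manipulation, can be illustrated with a small last-write-wins registry. This is a sketch under assumptions: the class, method, and identifier names are hypothetical and do not reflect the actual server code.

```python
class GoalBoard:
    """Keeps, for every entity, the goal set by the most recent command."""

    def __init__(self):
        # entity id -> (operator id, goal position)
        self._goals = {}

    def assign(self, operator, entities, goal):
        """Record `goal` for each entity, overwriting any earlier command."""
        for entity in entities:
            self._goals[entity] = (operator, goal)

    def goal_of(self, entity):
        """Return (operator, goal) for the entity, or None if it has no goal."""
        return self._goals.get(entity)


board = GoalBoard()
# Operator 1 sends a team of two robots to one position...
board.assign("op1", ["robot1", "robot2"], (1.0, 0.0))
# ...then operator 2 commands a team that shares robot2.
board.assign("op2", ["robot2", "robot3"], (0.0, 2.0))
```

In this example the shared robot follows the second operator, while the remaining robot of the first team keeps its original goal, matching the behavior described for overlapping teams.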

3.3 Transparency Modes

To investigate the role of various elements of the user interface, we endowed our client with the possibility to provide information to the user in several modalities. The main insight in our work is to consider the natural field of view of the human eye (see Fig. 7). We implemented our client to allow for both central transparency, i.e., displaying elements in the center of the screen or directly above robots and objects (green region in Fig. 7); and peripheral transparency, i.e., relegating interface elements to the borders of the screen (yellow region in Fig. 7). The key difference between central and peripheral transparency is the type and quantity of information displayed. With central transparency, the information is contextual and limited to the robots effectively visible on the screen (which changes as the operator modifies the camera pose). Peripheral transparency, on the other hand, always displays summary information on all the robots and the progress of each task.

The interface can be configured to show or hide every element. For the purposes of our work, we identified four essential “transparency modes”:

• No Transparency (NT). The interface hides all the information originating from the robots or other operators. The operator can still interact with robots and objects using all the control modalities.
• Central Transparency (CT). The interface overlays a direction pointer and text to indicate the heading and current task of each robot (as shown in Fig. 8). The color of the pointer matches the color of the fiducial markers on each robot to differentiate between multiple pointers. The robot status displays the current operation executed by the robot, corresponding to the states of the collective transport finite state machine (see Fig. 3). Additionally, the interface indicates the commands of other operators, to foster shared awareness across operators. This information is available only for entities in the operator’s field of view. The operator can move around in the environment to view information on other robots and objects that are not in the current field of view.
• Peripheral Transparency (PT). The interface offers a robot panel, an object panel, and a log window containing global information on the system and its constituents (see Fig. 9). The robot panel contains one icon for each robot. The panel highlights the icons corresponding to the robots that are moving or performing operator-defined actions. The panel also displays a warning, through a blinking exclamation point, to notify the operators of any fault conditions. These include getting stuck due to an obstacle, and software or hardware failures. The object panel shows all the objects in the environment. The interface highlights the objects currently manipulated by the robots. The panel also provides a functionality to select an object by clicking on the lock icon. An operator can convey their intention of manipulating an object by selecting the lock in the object panel. The interface highlights the lock with a blue icon to signify the operator’s own selection and a red icon to indicate the selection of another operator. An operator can lock only one object at a time and cannot overwrite the selection of other operators.
• Mixed Transparency (MT). The interface also allows one to enable both central and peripheral transparency. In this case, the displayed information is a combination of the two transparency modes.
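The locking rules of the object panel in peripheral transparency can be sketched as follows. The class and method names are hypothetical. In addition, the text does not say what happens when an operator who already holds a lock selects a second object; this sketch assumes that the earlier lock is released.

```python
class ObjectLocks:
    """Tracks which operator has declared the intention to move which object."""

    def __init__(self):
        self._owner = {}  # object id -> operator id

    def lock(self, operator, obj):
        """Try to lock `obj`; return False if another operator holds it."""
        holder = self._owner.get(obj)
        if holder is not None and holder != operator:
            return False  # selections of other operators cannot be overwritten
        # Assumption: locking a new object releases the operator's old lock,
        # so that each operator holds at most one lock at a time.
        for other_obj, other_holder in list(self._owner.items()):
            if other_holder == operator and other_obj != obj:
                del self._owner[other_obj]
        self._owner[obj] = operator
        return True

    def icon_color(self, viewer, obj):
        """Blue for the viewer's own lock, red for another's, None if free."""
        holder = self._owner.get(obj)
        if holder is None:
            return None
        return "blue" if holder == viewer else "red"


locks = ObjectLocks()
first = locks.lock("op1", "big_object_1")     # op1 declares an intention
refused = locks.lock("op2", "big_object_1")   # op2 cannot overwrite it
moved = locks.lock("op1", "small_object_2")   # op1 switches to another object
```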

3.4 Communication Modes

Analogously to transparency modes, the interface also defines different modes for inter-human communication. We classify inter-human communication into direct, indirect, and a combination of both. The communication modes are described as follows.

• No Communication (NC). In this mode, the operators are completely unable to communicate with each other. The interface hides all the information originating from other operators, such as which robots are being used and which objects are being manipulated.
• Direct Communication (DC). In this mode, the operators can communicate verbally while performing the task. We established a verbal communication channel using Zoom (www.zoom.us), a video-conferencing application. The operators are allowed to ask for help and strategize at will towards the completion of the task.
• Indirect Communication (IC). In contrast to direct communication, in this mode the operators cannot verbally communicate their intentions and actions, but they can use the presented transparency modes to communicate indirectly. In this paper, the choice of which transparency mode is active was determined by us at experiment time for the purposes of our study. In a realistic setting, however, each operator would be allowed to choose the most appropriate mode.
• Mixed Communication (MC). In this mode, the operators can communicate both directly and indirectly throughout the duration of the experiment.

4 User Study under Ideal Conditions

4.1 Preliminaries

The main purpose of this first set of experiments is to validate the usability of the various transparency ( $T$ ) and communication ( $C$ ) modes under ideal conditions in remote interaction ( $R$ ), i.e., with negligible loss of information. We base the experiments on the following main hypotheses.

Hypotheses on the impact of different transparency modes:

• H$^{R}_{T}$1: Mixed transparency (MT) has the best outcome with respect to other modes.
• H$^{R}_{T}$2: Operators prefer mixed transparency (MT) over other modes.
• H$^{R}_{T}$3: Operators prefer central transparency (CT) over peripheral transparency (PT).

Hypotheses on the impact of different communication modes:

• H$^{R}_{C}$1: Mixed communication (MC) has the best outcome with respect to other modes.
• H$^{R}_{C}$2: Operators prefer mixed communication (MC) over other modes.
• H$^{R}_{C}$3: Operators prefer direct communication (DC) over indirect communication (IC).

Experimental Setup. We designed a game scenario (shown in Fig. 10) where the operators were given 9 robots to transport 6 objects (2 big and 4 small) to a goal region. Big objects were worth 2 points each, and small objects were worth 1 point each. The operators had to work as a team to score as many points as possible, over a maximum of 8, in experiments lasting 8 minutes. The operators could move the big objects using the collective transport behavior, or directly use individual robots or team manipulation commands to push the objects.

Participant Sample. For this user study, we recruited 28 university students. Fourteen of them (5 female, 9 male), with ages ranging from 19 to 37 years old ( $23.28\pm 4.38$ ), performed the task four times with a different transparency mode (NT, CT, PT and MT) each time. The other 14 participants (4 female, 10 male), with ages ranging from 18 to 48 years old ( $23.64\pm 7.87$ ), performed the task four times with a different communication mode (NC, DC, IC and MC) each time. We chose the teams and the assignments at random. No participant had prior experience with the remote interface.

Procedures. Each session of the study had two participants and took approximately 105 minutes in total. After the participants signed the consent form, we explained the task and gave each participant 10 minutes to familiarize themselves with the system. We randomized the order of the tasks and the modalities to reduce the influence of learning effects. After each task, the participants had to answer a subjective questionnaire.

Metrics. We recorded subjective and objective measures for each participant and each task. We used the following common measures:

• Situational Awareness. We used the Situational Awareness Rating Technique (SART) taylor2017situational on a 4-point Likert scale likert to assess the awareness of the situation after each task.
• Task Workload. We used the NASA TLX hart1988development scale on a 4-point Likert scale to compare the perceived workload in each task.
• Trust. We used the trust questionnaire uggirala2004measurement on a 4-point Likert scale to compare the trust in the interface affected by each transparency mode.
• Quality of Interaction. We used a custom questionnaire on a 5-point Likert scale to assess the team-level and robot-level interaction. The interaction questionnaire is reported in Fig. 11.
• Performance. We used the points earned in each task as a metric to measure the performance achieved with each transparency mode.
• Usability. We asked participants to select the features (log, robot panel, object panel, and on-robot status) they used during the study. Additionally, we asked them to rank the transparency modes from 1 to 4, 1 being the highest rank.

- Did you understand your teammate’s intentions? Were you able to understand why your teammate was taking a certain action?
- Could you understand your teammate’s actions? Could you understand what your teammate was doing at any particular time?
- Could you follow the progress of the task? While performing the tasks, were you able to gauge how much of it was pending?
- Did you understand what the robots were doing? At all times, were you sure how and why the robots were behaving the way they did?
- Was the information provided by the interface clear to understand?

Figure 11: The subjective questionnaire employed in our user study to assess the quality of interaction of an operator with our interface.

Table 1: Results with relationships between transparency modes. The relationships are based on mean ranks obtained through a Friedman test. The symbol ^∗∗ denotes a significant difference ( $p<0.05$ ) and the symbol ^∗ denotes a marginally significant difference ( $p<0.10$ ). The symbol ^- denotes negative scales, where a lower ranking is better.

SART SUBJECTIVE SCALE
Attributes	Relationship	$\chi^{2}(3)$	$p$ -value
Instability of Situation^-	NT $>$ PT $>$ CT $>$ MT^∗∗	$9.554$	$0.023$
Complexity of Situation^-	NT $>$ PT $>$ CT $>$ MT^∗∗	$16.950$	$0.001$
Variability of Situation^-	not significant	$2.452$	$0.484$
Arousal	MT $>$ CT $>$ PT $>$ NT^∗∗	$8.550$	$0.036$
Concentration of Attention	MT $>$ CT $>$ PT $>$ NT^∗∗	$11.898$	$0.008$
Spare Mental Capacity	not significant	$2.209$	$0.530$
Information Quantity	MT $>$ CT $>$ PT $>$ NT^∗∗	$12.288$	$0.006$
Information Quality	MT $>$ CT $>$ PT $>$ NT^∗∗	$28.758$	$<0.001$
Familiarity with Situation	CT $>$ MT $>$ PT $>$ NT^∗	$6.276$	$0.099$
NASA TLX SUBJECTIVE SCALE
Mental Demand^-	NT $>$ PT $>$ CT $=$ MT^∗∗	$10.800$	$0.013$
Physical Demand^-	not significant	$5.634$	$0.131$
Temporal Demand^-	not significant	$1.760$	$0.624$
Performance	not significant	$6.169$	$0.104$
Effort^-	PT $>$ NT $>$ MT $>$ CT^∗∗	$6.630$	$0.085$
Frustration^-	not significant	$0.667$	$0.881$
TRUST SUBJECTIVE SCALE
Competence	MT $>$ CT $>$ PT $>$ NT^∗∗	$10.663$	$0.014$
Predictability	MT $>$ CT $>$ PT $>$ NT^∗∗	$19.469$	$<0.001$
Reliability	MT $>$ CT $>$ PT $>$ NT^∗	$7.478$	$0.058$
Faith	MT $>$ CT $>$ PT $>$ NT^∗∗	$15.138$	$0.002$
Overall Trust	MT $>$ CT $>$ PT $>$ NT^∗∗	$18.210$	$<0.001$
Accuracy	MT $>$ CT $>$ PT $>$ NT^∗∗	$10.590$	$0.014$
INTERACTION SUBJECTIVE SCALE
Teammate’s Intent	MT $>$ CT $>$ PT $>$ NT^∗∗	$9.923$	$0.019$
Teammate’s Action	MT $>$ CT $>$ NT $>$ PT^∗∗	$8.040$	$0.045$
Task Progress	MT $>$ CT $>$ PT $>$ NT^∗	$6.532$	$0.088$
Robot Status	MT $>$ CT $>$ PT $>$ NT^∗∗	$15.593$	$0.001$
Information Clarity	CT $>$ MT $>$ PT $>$ NT^∗∗	$8.414$	$0.038$
PERFORMANCE OBJECTIVE SCALE
Points Scored	not significant	$3.444$	$0.328$

Table 2: Ranking scores, in the transparency user study, based on the Borda count. The gray cells indicate the leading scenario for each type of ranking.

Borda Count	NT	CT	PT	MT
Based on Collected Data Ranking (Table 1)	22	63.5	38	76.5
Based on Preference Data Ranking (Fig. 13)	16	40	29	55

Table 3: Results with relationships between communication modes. The relationships are based on mean ranks obtained through a Friedman test. The symbol ^∗∗ denotes a significant difference ( $p<0.05$ ) and the symbol ^∗ denotes a marginally significant difference ( $p<0.10$ ). The symbol ^- denotes negative scales, where a lower ranking is better.

SART SUBJECTIVE SCALE
Attributes	Relationship	$\chi^{2}(3)$	$p$ -value
Instability of Situation^-	NC $>$ DC $>$ IC $>$ MC^∗∗	$29.105$	$<0.001$
Complexity of Situation^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$14.921$	$0.002$
Variability of Situation^-	NC $>$ DC $>$ IC $>$ MC^∗∗	$9.280$	$0.026$
Arousal	MC $>$ DC $>$ IC $>$ NC^∗∗	$28.240$	$<0.001$
Concentration of Attention	MC $>$ DC $>$ IC $>$ NC^∗∗	$24.570$	$<0.001$
Spare Mental Capacity	MC $>$ DC $>$ IC $>$ NC^∗∗	$23.579$	$<0.001$
Information Quantity	not significant	$3.286$	$0.350$
Information Quality	not significant	$4.168$	$0.244$
Familiarity with Situation	MC $>$ DC $>$ IC $>$ NC^∗∗	$12.282$	$0.006$
NASA TLX SUBJECTIVE SCALE
Mental Demand^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$21.023$	$<0.001$
Physical Demand^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$14.870$	$0.002$
Temporal Demand^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$17.433$	$0.001$
Performance	MC $>$ DC $>$ IC $>$ NC^∗∗	$12.429$	$0.006$
Effort^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$25.093$	$<0.001$
Frustration^-	NC $>$ IC $>$ DC $>$ MC^∗∗	$9.961$	$0.019$
TRUST SUBJECTIVE SCALE
Competence	MC $>$ DC $>$ IC $>$ NC^∗∗	$23.195$	$<0.001$
Predictability	MC $>$ IC $>$ DC $>$ NC^∗∗	$16.059$	$0.001$
Reliability	MC $>$ IC $>$ DC $>$ NC^∗	$6.861$	$0.076$
Faith	MC $>$ DC $>$ IC $>$ NC^∗∗	$13.425$	$0.004$
Overall Trust	MC $>$ DC $>$ IC $>$ NC^∗∗	$17.396$	$0.001$
Accuracy	MC $>$ DC $>$ IC $>$ NC^∗∗	$16.171$	$0.001$
INTERACTION SUBJECTIVE SCALE
Teammate’s Intent	MC $>$ DC $>$ IC $>$ NC^∗∗	$19.848$	$<0.001$
Teammate’s Action	MC $>$ DC $>$ IC $>$ NC^∗∗	$21.258$	$<0.001$
Task Progress	MC $>$ DC $>$ IC $>$ NC^∗∗	$13.176$	$0.004$
Robot Status	MC $>$ IC $>$ DC $>$ NC^∗∗	$13.991$	$0.003$
Information Clarity	MC $>$ IC $>$ DC $>$ NC^∗∗	$25.160$	$<0.001$
PERFORMANCE OBJECTIVE SCALE
Points Scored	not significant	$3.444$	$0.328$

Table 4: Ranking scores, in the communication user study, based on the Borda count. The gray cells indicate the leading scenario for each type of ranking.

Borda Count	NC	DC	IC	MC
Based on Collected Data Ranking (Table 3)	24	67	53	96
Based on Preference Data Ranking (Fig. 15)	16	38	30	56

4.2 Analysis and Discussion

4.2.1 Collected Data

Transparency Data. Table 1 shows the summarized results for all the subjective scales and the objective performance. We used the Friedman test friedman1937use to analyze the data and assess the significance of the differences between transparency modes. We derived a ranking based on the mean ranks for all the attributes that showed statistical significance ( $p<0.05$ ) or marginal significance ( $p<0.10$ ). Fig. 12 shows the percentage of operators using a particular feature. Fig. 13 shows the percentage of participants ranking each task according to their preference. We used the Borda count black1976partial method to calculate the overall ranking of the collected data and of the transparency mode usability data. We inverted the ranking of the negative scales when calculating the Borda count scores. Table 2 shows the results of the Borda count for each category.
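For readers less familiar with the Friedman test, the following self-contained sketch shows how the mean ranks and the $\chi^{2}$ statistic are obtained from per-participant ratings. It is an illustration only: the ratings are made-up toy values rather than study data, the function names are hypothetical, and the closed-form $p$-value is valid only for 3 degrees of freedom, i.e., when four modes are compared.

```python
import math


def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = shared
        i = j + 1
    return ranks


def friedman(ratings):
    """ratings[p][c] is the score of participant p under condition c.

    Returns the tie-corrected chi-square statistic and the mean ranks.
    """
    n, k = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[c] for row in rank_rows) for c in range(k)]
    stat = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    ties = sum(row.count(v) ** 3 - row.count(v) for row in ratings for v in set(row))
    correction = 1.0 - ties / (n * k * (k * k - 1))
    if correction > 0:
        stat /= correction
    return stat, [s / n for s in rank_sums]


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 d.o.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Toy data: four participants who all rate NT < PT < CT < MT.
toy = [[1, 2, 3, 4]] * 4
statistic, mean_ranks = friedman(toy)
p_value = chi2_sf_df3(statistic)
```

With four conditions, a statistic of $9.554$ maps to $p\approx 0.023$ under this formula, which agrees with the first row of Table 1.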

Communication Data. Table 3 shows the summarized results of the communication user study. We analyzed the data using the Friedman test friedman1937use to assess the significance of the differences among modes of communication. For all the attributes that showed statistical significance ( $p<0.05$ ) or marginal significance ( $p<0.10$ ), we derived a ranking based on the mean ranks. Fig. 14 shows the percentage of operators using a particular feature. Fig. 15 shows the percentage of people ranking a task based on their choice. Using the Borda count method, we derived an overall ranking based on the collected data and the user preference data (shown in Table 4). We inverted the ranking of the negative scales for the Borda count scores.
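
The Borda tally described above can be sketched as follows. The point scheme is an assumption: with four scenarios, first place earns 4 points and last place earns 1. The text does not state the scheme, although this choice is consistent with the totals in Table 4, where each row sums to a multiple of 10. The function name `borda_scores` is hypothetical, and only three rows of Table 3 are used for illustration.

```python
def borda_scores(rankings, negative=()):
    """Tally Borda points per scenario.

    rankings: dict mapping attribute name -> list of scenarios, highest rank first.
    negative: attribute names whose ranking is inverted before scoring.
    Assumed scheme: with k scenarios, first place earns k points, last earns 1.
    """
    totals = {}
    for attribute, order in rankings.items():
        if attribute in negative:
            # Negative scales: a high mean rank is bad, so reverse the order.
            order = list(reversed(order))
        k = len(order)
        for place, scenario in enumerate(order):
            totals[scenario] = totals.get(scenario, 0) + (k - place)
    return totals


# Three rows copied from Table 3 (highest mean rank first).
rows = {
    "Frustration": ["NC", "IC", "DC", "MC"],  # negative scale
    "Competence": ["MC", "DC", "IC", "NC"],
    "Predictability": ["MC", "IC", "DC", "NC"],
}
scores = borda_scores(rows, negative={"Frustration"})
print(scores)
```

For these three rows, the tally is 12 points for MC, 8 for DC, 7 for IC, and 3 for NC.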

4.2.2 Transparency Modes

Table 2 shows that mixed transparency (MT) is the best transparency mode in terms of usability, supporting hypotheses H ${}^{R}_{T}$ 1 and H ${}^{R}_{T}$ 2. From the results, central transparency (CT) dominates peripheral transparency (PT), supporting hypothesis H ${}^{R}_{T}$ 3. In addition, we analyzed each mode of transparency based on the sub-scales of the subjective data, as follows.

Mixed Transparency. This mode is the overall best choice for the operators. The results suggest that this mode provides the operators with the best situational awareness, measured in terms of the least instability and complexity of the situation, and the best information arousal, level of concentration, information quality, and information quantity. Through this transparency mode, the operators had the most information about the actions and intentions of teammates and robots, as well as about the task progress. This led the operators to report the highest trust across all trust sub-scales.

Central Transparency. This mode is the second best choice after mixed transparency. The operators had the best familiarity and clarity in terms of information provided by the interface. The operators experienced the lowest mental load and reported the least effort in performing the task. Fig. 12 supports these findings as 92% (13 out of 14 operators) indicated the on-robot status as the most useful feature.

Peripheral Transparency. The operators reported peripheral transparency as the most cumbersome mode. The operators experienced the lowest awareness, which caused degraded trust. The operators reported that the mode was merely better than no transparency (NT), because the presence of some information is still better than no information.

Comparison with Proximal Interaction. Overall, the conclusions of this study are in line with those we reported for proximal interaction. However, the results in this paper are more substantial than what we observed for proximal interaction. Unlike in proximal interaction, mixed transparency in remote interaction was the clear winner, both in the collected data ranking and in the preference data ranking (see Table 2). Central transparency not only outperformed peripheral transparency in remote interaction, but did so by a wider margin than in the study with proximal interaction. We speculate that this difference is due to the fact that, in proximal interaction, the operators had to devote effort to avoid bumping into robots and other operators while walking. This made the operators alert and anxious, affecting their focus on the information offered by the interface and the transparency modes. In remote interaction, as there was no need to physically move, the operators could focus on the displayed information more effectively.

Our experiments did not reveal a substantial difference in performance across transparency modes. We hypothesize that this lack of difference is due to the learning effect across the four runs that each team had to perform. Fig. 16 shows the performance in each task and Fig. 17 reports the increase in performance due to the task order (learning effect). As most of the teams were able to complete the task in less than 8 minutes, Fig. 18 shows the decrease in time taken to complete the task in order of the performed task, indicating the impact of the learning effect.

4.2.3 Modes of Communication

Table 4 suggests that mixed communication (MC) is the best mode of communication, both in terms of usability preference and in terms of the data collected during the user study, supporting hypotheses H ${}^{R}_{C}$ 1 and H ${}^{R}_{C}$ 2. In addition, direct communication (DC) outperformed indirect communication (IC), confirming hypothesis H ${}^{R}_{C}$ 3. We also analyzed each mode of communication based on the sub-scales of the subjective data, as follows.

Mixed Communication. Mixed communication was recognized as the best mode, not only based on the Borda count but also looking at the results of the subjective data. This mode had the best situational awareness, trust in the system, and interaction with the robots and the operator, while having the lowest task load.

Direct Communication. This mode was the second best. It outperformed indirect communication in terms of information awareness and communication with the other operator (operator-level information), resulting in better trust in the system and lower workload with respect to indirect communication.

Indirect Communication. This mode was the third best choice. This mode proved to be better in conveying robot-level information, thus allowing the operator to better understand and predict robot actions, when compared to direct communication. This made the operators trust this mode more in terms of predictability and reliability, but at the cost of experiencing higher workload in comparison to mixed communication and direct communication.

Comparison with Proximal Interaction. As with transparency, these observations are in line with the results of the proximal interaction study Patel2021 . However, the results of this study were more decisive than those of the proximal interaction study. In this case as well, we observed that proximal interaction made the operators alert and anxious about the robots and the other operator. Moreover, as the operators had to physically walk around the robots, the interaction felt cumbersome at times. This observation is supported by the workload results of the proximal interaction studies in our previous work, which indicate a high workload in all modes of communication. In contrast, the workload results in remote interaction showed a significant difference between communication modes.

Our experiments did not reveal a significant difference in performance across communication modes. Similarly to what we discussed for transparency, we hypothesize that this lack of difference is due to the learning effect across the four runs that each team had to perform. Fig. 19 indicates the points earned by the operators in each task and Fig. 20 shows the learning effect as the increase in points earned in order of the performed task. As most of the operator teams were able to complete the task in less than 8 minutes, Fig. 21 shows the decrease in time taken to complete the task in order of the performed task, a clear indicator of the learning effect.

5 User Study with Information Loss

The study presented so far was based on the assumption that the information flow was fast and continuous for every operator. This was possible because all the users involved in our experimental evaluation had fast, stable Internet connections that showed no issues. However, in remote operations, fast and stable connectivity cannot be taken for granted.

For this reason, we investigate the role that intermittent information flow plays in the efficiency of remote multi-human multi-robot interaction. In this paper, we measure information loss as the time elapsed between two updates of the graphical user interface. In other words, we define information loss as the inverse of the frame rate. With operators and robots in separate environments, it is likely for the operators to experience different levels of information loss. When this happens, we speak of heterogeneous information loss.
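
The definition above can be made concrete with a minimal sketch. The function names and the tolerance parameter are hypothetical and serve only as an illustration; they are not part of the interface used in the study.

```python
def information_loss_s(frame_rate_hz):
    """Seconds elapsed between two GUI updates for a given update rate (Hz)."""
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    return 1.0 / frame_rate_hz


def is_heterogeneous(losses_s, tolerance_s=1e-9):
    """True when the operators do not all experience the same information loss."""
    return max(losses_s) - min(losses_s) > tolerance_s


# One operator refreshed at 2 Hz and another at 0.4 Hz.
team = [information_loss_s(2.0), information_loss_s(0.4)]
print(team, is_heterogeneous(team))
```

In this example the two operators see updates every 0.5 s and every 2.5 s respectively, so the team experiences heterogeneous information loss.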

For the purposes of our study, we categorize information loss in two ranges of usability. The high usability range (U_H) corresponds to levels of information loss that cause negligible discomfort in the operators that experience it. Conversely, the low usability range (U_L) corresponds to levels of information loss that an operator cannot ignore and that cause some sort of discomfort.

In general, the exact extent of these ranges changes with the operators. We thus split our study in two parts. In the pilot study (Sec. 5.1), we investigate the extent of the usability ranges in experiments that involve a single operator. Next, in the main study (see Sec. 5.2), we turn to multiple operators and assess the effect of heterogeneous information loss, using the homogeneous case as a baseline reference.

5.1 Information Loss Pilot Study

Experimental Setup. For our pilot study with a single operator, we used the game scenario presented in Sec. 4 (see Fig. 10). The operator was tasked with performing half of the game: moving 1 big object and 2 small objects. In contrast to the previous game, we set no time limit to complete the task, instead declaring completion when the required objects reached the goal region. Every participant had to perform the task 6 times, with a different level of information loss each time. The levels spanned from 0 s to 2.5 s in increments of 0.5 s. To compensate for possible learning effects or other confounding factors, we defined four level orderings:

• Increasing order: the information loss increases with every task.
• Decreasing order: the information loss decreases with every task.
• Random 1: the information loss follows the order $\{0,2.5,0.5,2,1,1.5\}$ s.
• Random 2: the reverse order with respect to Random 1.
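
The four orderings listed above can be written down as a short sketch. The constant and function names are hypothetical. The balanced assignment of participants to orderings is also an assumption: the text only states that the assignment was random.

```python
import random

LEVELS_S = [0.5 * i for i in range(6)]  # 0.0, 0.5, ..., 2.5 seconds

RANDOM_1 = [0.0, 2.5, 0.5, 2.0, 1.0, 1.5]

ORDERINGS = {
    "increasing": sorted(LEVELS_S),
    "decreasing": sorted(LEVELS_S, reverse=True),
    "random_1": RANDOM_1,
    "random_2": list(reversed(RANDOM_1)),
}


def assign_orderings(participant_ids, seed=0):
    """Shuffle the participants, then deal the orderings out in turn.

    Assumption: orderings are balanced across participants; the study
    only reports that each participant was randomly assigned one ordering.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    names = list(ORDERINGS)
    return {pid: names[i % len(names)] for i, pid in enumerate(ids)}


assignment = assign_orderings(range(20))
```

Every ordering visits each of the six levels exactly once, so the orderings differ only in the sequence in which a participant encounters them.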

Participant Sample. We recruited 20 university students (7 females, 13 males) with ages ranging from 18 to 31 years old ( $22.75\pm 3.57$ ). All participants were randomly assigned one task ordering. Each participant performed the 6 tasks in the determined order. No participant had prior experience with the remote interface.

Pilot Study Procedure. Each session of the study took approximately 90 minutes. After the participant signed the consent form, we explained the task setup and gave the participant 12 minutes to become familiar with the system. After each task, the participant had to answer a subjective questionnaire.

Metrics. We recorded the subjective and objective measures for each participant for each task. The performance of the operator was measured as the time taken to complete a task. We used the NASA TLX questionnaire hart1988development with a 10-point Likert scale to compare the perceived workload in each task. In addition to the workload questionnaire, the participants were requested to report the experienced discomfort on a 10-point Likert scale, followed by a comment box for a free-form description of the type of discomfort experienced.

Results. For each item in the NASA TLX scale, we report a significance matrix based on the Friedman test to identify the two ranges of usability. The results are shown in Tables 5-12. The green cells in these tables indicate the high usability range and the red cells indicate the low usability range. We also superimposed the usability ranges in Table 13. From our data, we estimate the high usability range between 0 s and 0.5 s, and the low usability range between 2 s and 2.5 s. For the upcoming main study on information loss (Sec. 5.2), we took the midpoints of these ranges (0.25 s and 2.25 s). Figures 22-29 report the box plots of the recorded readings for the respective metrics.
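
The midpoints used for the main study follow directly from the estimated ranges, as the following sketch shows. The names are hypothetical. Levels strictly between 0.5 s and 2 s are left unclassified here, because the estimate above assigns them to neither range.

```python
HIGH_USABILITY_S = (0.0, 0.5)  # estimated high usability range, in seconds
LOW_USABILITY_S = (2.0, 2.5)   # estimated low usability range, in seconds


def midpoint(bounds):
    """Midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


def usability_range(loss_s):
    """Return 'U_H', 'U_L', or None for a given information loss in seconds."""
    if HIGH_USABILITY_S[0] <= loss_s <= HIGH_USABILITY_S[1]:
        return "U_H"
    if LOW_USABILITY_S[0] <= loss_s <= LOW_USABILITY_S[1]:
        return "U_L"
    return None  # between the two estimated ranges: not classified


main_study_levels = (midpoint(HIGH_USABILITY_S), midpoint(LOW_USABILITY_S))
print(main_study_levels)
```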

Table 5: Significance matrix for differences in performance between levels of information loss. The shaded regions indicate the two ranges of usability. The cell entries are the

p