Once you graduate from university and start in the robotics workforce, you will be exposed to a massively different world than you've encountered in your classes, educational competitions like FIRST robotics, team projects in student organizations, and even research projects at well-renowned labs. You may be thinking, "Well, I've worked on team class projects, so I know pretty much how this will go. Some people pull their weight, some slack off, but, in the end, everything will go ok for our final report. The workforce will be pretty much like that." (Oh, how I wish to be so young and naive again!)
Don't underestimate it: the scale of the engineering effort (and its impact) in the enterprise setting will be larger than anything else you have experienced at university, the impact of the "soft skills" like teamwork and communication will be much higher, and the quality standards for your technical contributions will be higher too. It is not uncommon to feel some culture shock during this transition. Hopefully you will have had some summer internships doing real R&D to help you prepare for this experience. Or, you may be lucky enough to participate in one of the few university labs that engages in system engineering at a reasonable scale -- and by "reasonable" I mean 10+ simultaneous developers on one project. Even if you're a student with a 4.0 GPA, if you can't adapt to the complexities of systems engineering, you might end up a perpetual junior engineer bumbling your way around an organization with no hope for career advancement.
Real-world robotics engineering requires working on large and diverse teams over long periods of time. A good engineer is grounded in 1) the theory governing component algorithms, 2) system integration and development practices, and 3) effective communication skills to document, justify, and advocate for their work. Teams of engineers also need managers, and a good manager is grounded in 4) logical and organized thought about the system at appropriate abstraction levels, 5) project management skills to plan, assign, and track development progress, and 6) people skills to motivate the engineering team and convey the vision and progress to upper-level management. Both classes of employees should also bring a sense of personal investment in the project so that they stay enthusiastic as setbacks are encountered, the project scope changes, and personnel changes occur. Although a lot of these aspects cannot be taught outside of self-help books, we will be able to provide some degree of training in items 2), 3), 4), and 5) in this book.
This chapter provides a brief overview of the theory, processes, project management, and organizational practices of typical robotic systems engineering projects. We can only scratch the surface of this material, as there have been many wonderful books written about systems engineering, software engineering, organizational strategy, and organizational psychology. People can be quite opinionated and passionate about these topics, and we lack hard data exploring which methods are more successful than others, so it's best not to delve too deep into any one philosophy. Nevertheless, this high-level summary should help the aspiring robotics engineer (and engineering manager) predict the terminology, best practices, and pain points they can expect to encounter in their future career.
It is hard to define precisely what a "system" means, but for the most part we can settle on a somewhat vague meaning: a system is an artifact composed of multiple interacting components that is engineered for a defined purpose. Often (but not always) these components correspond to different physical units, computational devices, or pieces of code. The most critical aspect of this definition is that the components themselves are engineered to produce a specified function by interacting with other components within the system. We can view a system as a network of components interacting through edges (a system diagram) and reason about operations and information flow at a more abstract level than thinking about the details of how each component is implemented. At an organizational level, we can also think about projects in terms of a timeline of implementing components, measuring their performance, or replacing old implementations with new ones.
Abstraction is the key tool we use in system engineering to manage complexity. Abstraction is also hammered home in typical computer science curricula due to the complexity of large software projects. Its purpose is a cognitive one: human brains are simply incapable of reasoning holistically about thousands or millions of lines of computer code interacting with controllers, power electronics, motors, and mechanisms. Instead, for our own benefit we must break the system into smaller components, each of which fulfills a specific function. These "functions" are our mental model of how each component behaves or should behave. The mechanism by which we achieve abstraction is called encapsulation, which means hiding details of the implementation from the external user. We should not need to know all the details by which a motion planner works in order to use it, e.g., if it uses RRT, PRM, trajectory optimization, etc. We just need to know the inputs, the outputs, and its expected performance.
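To make encapsulation concrete, here is a minimal sketch (the class and function names are hypothetical, not taken from any particular library): callers depend only on the planner's interface, not on whether the implementation underneath uses RRT, PRM, or trajectory optimization.

```python
from abc import ABC, abstractmethod
from typing import List, Optional

Config = List[float]  # a robot configuration, e.g., a vector of joint angles

class MotionPlanner(ABC):
    """Abstract interface: users see only inputs, outputs, and expected behavior."""

    @abstractmethod
    def plan(self, start: Config, goal: Config) -> Optional[List[Config]]:
        """Return a collision-free path from start to goal, or None on failure."""

class RRTPlanner(MotionPlanner):
    def plan(self, start: Config, goal: Config) -> Optional[List[Config]]:
        # ... sampling-based planning details are hidden from the caller ...
        return [start, goal]  # stand-in result for illustration

def execute_pick(planner: MotionPlanner, start: Config, goal: Config) -> bool:
    """Client code depends only on the MotionPlanner abstraction."""
    path = planner.plan(start, goal)
    if path is None:
        return False
    # ... send the path to the controller ...
    return True
```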
Note that in a sufficiently complex system, the components are usually also systems themselves, built out of sub-components! You may ask, why do we choose one level of abstraction over another? One could define a car as a system of tens of thousands of parts down to the last bolt, but for most purposes that is not as useful an abstraction as defining a car as a body, frame, engine, wheels, steering, electrical system, and passenger compartment. Useful for whom? Well, the company management, engineers, factory workers, parts suppliers, certification agencies, repair shops, and customers all tend to think of the vehicle in terms of those parts. Indeed, the theory, expertise, design, tooling, and operation of each of these components are specialized for their specific function.
As a systems engineer, you may welcome abstraction at times, but at others, you may struggle against it. Some possible pitfalls include:
Considering again the car example, if you are designing a sleek and streamlined body with decorative elements that you know will sell to customers, you may run into a struggle with the engine designer who can no longer fit a sufficiently beefy engine to give the customers the horsepower they desire. This is a compatibility conflict which needs clever engineering or strong management to resolve. (If you are Ferrari, your boss tells you to quiet down and design the body around the engine!)
An incorrect abstraction is one in which one's mental model of the system is not satisfied by the implementation. As a real-world example, my lab struggled with an issue for several days during development for the Amazon Picking Challenge. We found that when we were testing at certain times of the day, our robot would start acting strange and the picking performance would drop precipitously. Then we'd test again, and everything would work fine. The culprit? The Intel RealSense cameras we had at the time would normally report RGBD data at 30 frames per second (fps) in good lighting, but would silently drop to 15 fps in poor lighting. Because the students on the team would work long into the night, they set up the perception system to work appropriately with the lower frame rate. But at the higher frame rate, some network buffers were being filled with too many RGBD images, and so the perception system was processing stale data from multiple seconds in the past. The issue here was that our working mental model of the camera was a device that provided data at a consistent rate, and this abstraction was incorrect. Perhaps we should have read the documentation better or constructed more thorough unit tests!
Leaky abstractions are a similar concept that can cause all sorts of frustration. In the Amazon Picking Challenge, the variable frame rate of the camera caused side effects that we did not account for, because we did not design the perception system with all the details of the ROS communication system in mind. The publish-subscribe abstraction used by ROS is, coarsely speaking, "a publisher sends a message and the subscriber(s) immediately get it." To find the issue, the developer needed to know more about networking than that abstraction promises -- specifically, ROS queues and the slow receiver problem. Once we found the culprit, the fix was easy (shortening the queues to only provide the latest data), but placing blame on the right component was tricky. (We'll see more about how to assign blame to components later.)
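For illustration, the kind of fix looked roughly like the following sketch using the ROS 1 rospy API (the topic name, node name, and callback here are illustrative, not our actual code): a queue size of 1 means a slow subscriber always processes the most recent frame rather than a backlog of stale ones.

```python
import rospy
from sensor_msgs.msg import Image

def on_image(msg: Image) -> None:
    # Process only the newest frame; anything older was dropped by the queue.
    pass

rospy.init_node("perception_node")
# queue_size=1 keeps just the latest message, so a subscriber that falls
# behind the camera's frame rate never works on data from seconds in the past.
rospy.Subscriber("/camera/rgb/image_raw", Image, on_image, queue_size=1)
rospy.spin()
```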
An overzealous abstraction occurs when a component encapsulates too much functionality, and developers of other components need a finer level of control over its internal functions. For example, developers of industrial robots often provide a "go-to" subroutine that does not terminate until the robot arrives at its destination (or encounters a fault). This would not be acceptable if you wished to build a collision avoidance system that could stop the robot mid-motion if an obstacle were detected in the robot's path. A similar concept is the bad abstraction, in which a component tries to do a collection of things whose grouping is poorly rationalized or cognitively complex. Bad abstractions often come from a combination of overzealous encapsulation and changing requirements: as new use cases arise, the developer adds more and more configuration parameters to customize how the component functions, leading to an unwieldy, confusing set of inputs.
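A hypothetical sketch of the contrast (all names invented for illustration): the blocking "go-to" call hides too much, while a preemptible interface exposes just enough control for a collision monitor to intervene.

```python
import time

class BlockingArm:
    """Overzealous abstraction: move_to does not return until arrival or fault."""
    def move_to(self, target):
        time.sleep(1.0)  # stand-in for a motion that cannot be interrupted

class PreemptibleArm:
    """Finer-grained interface: motion can be monitored and stopped mid-way."""
    def start_move_to(self, target):
        self._done_time = time.time() + 1.0  # stand-in for real motion progress
    def is_moving(self):
        return time.time() < self._done_time
    def stop(self):
        self._done_time = time.time()

def obstacle_detected():
    return False  # stand-in for a real obstacle detector

# With the blocking API, nothing can react until the motion finishes.
BlockingArm().move_to([0.5, 0.2, 0.3])

# With the preemptible API, a collision monitor can stop the robot mid-motion.
arm = PreemptibleArm()
arm.start_move_to([0.5, 0.2, 0.3])
while arm.is_moving():
    if obstacle_detected():
        arm.stop()
        break
    time.sleep(0.01)
```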
An aspect of abstraction that is somewhat unique to robotics is that many upstream components must model downstream components in order to function properly. For example, state estimators, object trackers, and planners need a dynamics model to predict how the system moves. If the movement mechanisms or low-level controllers grow in complexity, the dynamics become more complex, necessitating more complex models. Similarly, increased sensor capabilities usually lead to greater complexity in observation models used in state estimation or active sensing. For this reason, as we seek to improve component performance, we usually pay the price in terms of model complexity.
The most important part of organizing a team of engineers is to build a shared mental model of what function the system should perform, what components the system will consist of, and how those components will operate. There are many ways to build such mental models, listed here in order of formality:
As a general rule, information should flow down this list toward documentation, diagrams, and specifications. As the formality of such information grows, it becomes more precise, interpretable, widely disseminated, and longer-lasting. The tradeoff is that turning mental models into formal knowledge takes time and effort. Keeping formal knowledge up-to-date is also more time-consuming.
TODO
TODO
TODO
TODO
Given $n$ components in sequence, any one of which may fail independently with probability $\epsilon$, the probability that at least one of them fails is $1-(1-\epsilon)^n$.
Given $m$ (redundant) components in parallel, each of which may fail independently with probability $\epsilon$, the probability that all of them fail is $\epsilon^m$.
TODO figure
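To make the arithmetic of the two equations above concrete, here is a small sketch (assuming independent failures, a caveat discussed next):

```python
def serial_failure_prob(eps: float, n: int) -> float:
    """Probability that at least one of n independent components fails."""
    return 1.0 - (1.0 - eps) ** n

def parallel_failure_prob(eps: float, m: int) -> float:
    """Probability that all m independent redundant components fail."""
    return eps ** m

# 20 components in sequence, each 1% likely to fail: ~18% chance of some failure.
print(serial_failure_prob(0.01, 20))   # 0.182...
# 3 redundant IMUs, each 1% likely to fail: 1e-6 chance that all fail at once.
print(parallel_failure_prob(0.01, 3))  # 1e-06
```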
Now, it should be cautioned not to directly use these equations to predict true system failure probabilities, because component failures are often not independent. Suppose that in the spirit of redundancy, we have outfitted a drone with 3 inertial measurement units (IMUs) so that we have two backups in case any one fails. Each one may fail with a probability $<$ 1%, so we should expect our system to fail with probability $<$ 0.0001%, right? Well, if the IMUs rely on GPS readings for global positioning or on Earth's magnetic field for a compass heading, all three IMUs may be susceptible to GPS denial (indoor environments, tall buildings, or jamming), GPS blackouts, and magnetic interference. Or, if the drone gets jerked around rapidly, accelerometers may saturate leading to degradation of accuracy. So, a developer should watch out for common causes of simultaneous failure.
Nevertheless, these equations give a compelling rationale for three high-level goals in system development: 1) hardening individual components to reduce their failure rates, 2) shortening chains of dependent components, and 3) adding redundancy so that a single component failure does not become a system failure.
You may ask, what strategies should a development team pursue to accomplish each of these goals? Addressing the first is fairly straightforward: find the "weakest links" in a sequence, and get your domain specialists to improve the robustness of ("harden") those components. Unit testing is an important practice to adopt here.
To address the second goal, we find that in robotics it is often impossible to reduce chains of dependencies past perception, planning, and control steps. There have been research efforts to perform "end-to-end" learning that circumvent intermediate steps, but these approaches have not yet reached the reliability and customizability of the classical sense-plan-act framework. On the other hand, we have seen major advances in perception where classical pipelines that have involved long chains of processing steps (e.g., from pixels to features, from features to parts, from parts to objects) have been replaced by deep neural networks. Also, complex planning pipelines that involve mission planning logic, task sequencing, subgoal definition, and low-level planning can be replaced with unified components, such as task-and-motion planning. It can also be helpful for an upstream planner to define a scoring function that rates terminal states rather than a single goal for a downstream planner. This is because an upstream planner can make a mistake in assigning an infeasible goal, and the downstream planner would be unable to find a solution. If, instead, the upstream planner assigns scores for possible goals (penalizing unfavorable goals), then the downstream planner has more options: it could find a less favorable but feasible solution.
Approaches for the third goal depend on whether the component's failures are a) reported by the component, b) due to random phenomena, such as sensor noise or a mechanical device wearing out, or c) systematic errors, such as sensor artifacts or an algorithm failing to find a valid solution. In cases a) and b) the replication approach simply adds duplicate units. If failures are reported (case a), then it is a simple matter of switching to a backup when a primary unit fails. If they are not detectable, then you will need a mechanism to estimate which units are failing, such as taking the median of 3 or more sensors or adding a separate anomaly detector. The median approach is an effective way of handling malfunctioning sensors that may report extreme values, since the median of 3 sensors will be one of the values from the remaining 2 functioning sensors. In case c), replication is insufficient since each unit will fail in the same way, i.e., each unit's errors would be affected by a common cause. Instead, you should implement alternative approaches that fail in different conditions than the primary approach. For example, for high-reliability scenarios like autonomous driving it is a good idea to consider implementing multiple redundant planners (e.g., generate-and-score, graph-search, sampling-based planners, and trajectory optimization) which are run simultaneously on different CPU cores. The resulting paths can then be rated and the best one chosen for execution.
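As a small illustration of the median-voting idea (the sensor values here are invented for the example):

```python
import statistics

def fused_reading(readings):
    """Median voting over redundant sensors: a single extreme (malfunctioning)
    reading cannot drag the fused value away from the functioning sensors."""
    return statistics.median(readings)

print(fused_reading([9.79, 9.81, 9.80]))   # all healthy -> 9.80
print(fused_reading([9.79, 9.81, 147.3]))  # one sensor railed high -> 9.81
```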
Generally speaking, a robotics project will follow the four phases listed here. If the organization is lucky, these steps and phases proceed one after another without a hitch. But I have never heard of such a case in my life, and I never expect to! We will discuss caveats to this outline below.
Product team | System integration team |
---|---|
Requirements gathering | System architecture design |
Hardware team | Perception team | Dynamics and control team | Planning team |
---|---|---|---|
Design | Calibration | System identification | Obstacle detection |
Fabrication | State estimation | Tracking control | Cost / constraint definition |
Integration | Visual perception | Control API development | Motion planning |
Modeling | 3D perception | | Mission planning
System integration team | Product team |
---|---|
System integration | User interface development |
Logger development | User interface testing |
Debugging tool development (visualization, metrics, etc) | |
Data gathering, machine learning | |
Iterative development and tuning |
Hardware team | Product team | Sales and marketing |
---|---|---|
Scaling up fabrication | Product requirement validation | Technical documentation |
Design for mass production | Certification | Marketing |
Supply chain organization | User acceptance testing | Deployment |
In reality, development will be continual both within a phase and between phases. Within a phase, there will inevitably be iterative evaluation and design as components are tested and refined, and when interacting components are upgraded. There will also be continual work between phases. (... more specifically, between phases I-III, since most robotics companies never get to a product!) Requirements will change, integration challenges will kick problems back to the component teams, data gathered from integration testing will go back to tuning and machine learning, acceptance testing may require repeating the planning phase, etc. So, even though you might worry as a mechanical engineer that your job will no longer be needed after the start of Phase II, in reality you are likely to be called upon throughout the development process.
Moreover, in a later section we will describe the concept of vertical development, in which teams are created early in the development process to solve Phase III problems. This is a very good idea, as it can be hard to predict all of the integration problems that will be encountered. User interface development is also often an afterthought in many engineering projects, but getting early results from user interface testing is another very good idea. The end user might find something confusing, or not so useful, or might even be satisfied with a partial product! Having this information at hand can drastically shape the landscape of development priorities and make the difference between success and failure within the development budget and timeline.
Project management is a "soft skill" that comes in handy both in industry as well as in academia. When proposing a project, whether in the form of a pitch meeting or a grant proposal, the person who holds the purse strings is not only going to want to hear your idea, but also evidence to support confidence that the project will be managed well. This requires giving a plan about how you will manage four key resources: time, money, people, and existing infrastructure. Now, delivering on your plan also requires day-to-day management skills. All that is written on this topic could fill volumes, but we will touch on a few key points in project management here.
The first stage of project management is project planning. In your plan, you should articulate both a vision for what you hope to achieve as well as activities that you hope will get you there. The vision is a broad statement about what you hope to achieve by the end of the project. The activities are specific steps that, if successfully executed, will achieve the vision. Usually these activities are organized around deliverables, milestones, phases, or aims. Regardless of what you call them, it is very important for these activities to be articulated in a way that builds confidence in your approach and begins to organize your team's use of key resources. SMART goals are a helpful tool for articulating these activities. The acronym SMART refers to goals that are Specific, Measurable, Achievable, Relevant, and Time-bound.
An additional (and often underappreciated) consideration is whether your activities are complete and complementary. Completeness means that the vision will be achieved if each of your activities is finished successfully. Complementarity means that your activities build on one another and do not duplicate effort. If you fail to articulate a complete set of activities, your audience is left to hope that your team will somehow figure something out to fill the gaps without requesting additional resources. If you fail to articulate a complementary set of activities, it sounds like you are wasting resources.
Don't underestimate how hard this is; it takes practice, experience, and deep thought to write a good plan. I have seen senior managers and tenured faculty struggle through it, resulting in projects being shuttered and grants being rejected. Some typical pitfalls are poorly articulated metrics or evaluation plans, omission of underappreciated steps (typically bridging components, integration efforts, or human-machine interfaces), and under-resourcing or unrealistic resourcing (e.g., we will spend $200,000 on a robot and then hire an engineer who will program the application in 6 months).
Another important part of project planning is scheduling. For small projects you can provide estimates based on your prior experience, but as projects grow in complexity and duration, you will need some tools to help understand how people, time, and equipment are allocated to the tasks that make up your activities. One key tool for planning your project schedule is a Gantt chart. These are fairly straightforward charts with a timeline, broken into periods (e.g., days, weeks, months, quarters), on the X axis and tasks on the Y axis. Task dependencies are indicated with an arrow from the completion of one task to the start of another. This might sound obvious, but it is essential that any task that depends on the completion of another starts later in the timeline!
TODO: Gantt chart example.
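In the meantime, here is a rough sketch of how a simple Gantt chart could be rendered programmatically with matplotlib (the task names and durations are made up for illustration):

```python
import matplotlib.pyplot as plt

# (task, start week, duration in weeks) -- illustrative numbers only
tasks = [
    ("Requirements gathering", 0, 4),
    ("System architecture",    2, 4),
    ("Perception prototype",   6, 8),
    ("Planner prototype",      6, 6),
    ("System integration",    14, 6),
]

fig, ax = plt.subplots()
for i, (name, start, duration) in enumerate(tasks):
    ax.broken_barh([(start, duration)], (i - 0.4, 0.8))  # one bar per task
ax.set_yticks(range(len(tasks)))
ax.set_yticklabels([name for name, _, _ in tasks])
ax.invert_yaxis()          # first task at the top
ax.set_xlabel("Week")
plt.tight_layout()
plt.show()
```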
Another important part of scheduling is assigning tasks to time periods. In doing so, you ought to estimate how long each task will take assuming a given level of staffing, including optimistic (best-case), realistic (average-case), and pessimistic (worst-case) durations. The optimism of your scheduling should be chosen to correspond with your project sponsor's tolerance of time and budget overruns. Each task with a dependency should be scheduled after the end of the tasks it depends on. The time spacing between the end of one task and the beginning of another, if any, is a margin of error that your project may tolerate.
A natural question is: how long might the entire project take? To figure this out, we need to determine the critical path through the schedule.
Critical path: the sequence of dependent tasks in a plan whose total duration is the longest.
The critical path can be determined by setting up the tasks as a weighted directed acyclic graph (DAG) and finding the longest (weighted) path. (Although an algorithm can be used to solve the problem, it is usually easy enough to find the critical path by manual inspection). Any tasks on the critical path must be completed on schedule in order for the project to be completed on time, whereas non-critical tasks are often allowed some margin of error.
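For illustration, here is a sketch of computing the critical path as a longest weighted path over a DAG of tasks (the task names and durations are invented):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# task -> (duration in weeks, prerequisite tasks); numbers are illustrative
tasks = {
    "requirements": (4, []),
    "architecture": (4, ["requirements"]),
    "perception":   (8, ["architecture"]),
    "planning":     (6, ["architecture"]),
    "integration":  (6, ["perception", "planning"]),
}

def critical_path(tasks):
    """Return the longest (total-duration) chain of dependent tasks and its length."""
    order = TopologicalSorter({t: deps for t, (_, deps) in tasks.items()}).static_order()
    finish = {}     # earliest finish time of each task
    best_pred = {}  # predecessor on the longest chain ending at each task
    for t in order:
        duration, deps = tasks[t]
        earliest_start = max((finish[d] for d in deps), default=0)
        finish[t] = earliest_start + duration
        best_pred[t] = max(deps, key=lambda d: finish[d]) if deps else None
    # Walk backwards from the task that finishes last.
    path, t = [], max(finish, key=finish.get)
    while t is not None:
        path.append(t)
        t = best_pred[t]
    return list(reversed(path)), max(finish.values())

path, weeks = critical_path(tasks)
print(path, weeks)  # ['requirements', 'architecture', 'perception', 'integration'] 22
```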
Note that as stated in its traditional form, critical path analysis does not take into account resource limits (except for time). Limited resources can significantly affect the schedule of a project. Imagine that everything must be done by one person: there's only so much time in the day, so you cannot execute multiple tasks in parallel unless your effort on each task is less than 100%. To come up with a resource-limited schedule, you will need to ensure that, throughout the timeline, the total effort assigned to simultaneous tasks does not exceed your available capacity.
TODO: Budgeting and personnel: < 10% to hardware
TODO: Personnel assignment %FTE
Every plan has some risks associated with it: the likelihood that everything goes perfectly according to plan is quite low! You may be building some untested technology, the implementation of a component might be harder than it looks, end-user testing may reveal that you haven't planned to implement a necessary feature, personnel might quit, IT infrastructure can crash, your whole country may suffer from geopolitical destabilization... We can't eliminate all sources of risk during project management, but we can mitigate their impact.
The first step of risk management is to anticipate parts of the plan that are especially risky. Steps that involve complex implementation, untested technology, end-user evaluation, and regulatory certification typically require some degree of extra scrutiny. Once sources of risk have been identified, a project manager should consider creative solutions to de-risk the plan.
De-risking is industry jargon for reducing the likelihood or the impact of risk on a plan. There are a few ways to de-risk a plan:
Assuming you have followed these guidelines and won approval, congratulations! Your project now needs to start. The first task is for you to get your team aligned on the specific technical steps that need to be executed to complete a project goal. The first place to start is usually a requirements document, which declares technical objectives in terms of specific, measurable aspects of the deliverable. You will then (or simultaneously) write a design document that outlines the steps and timeframe for achieving those objectives.
TODO: Project tracking with Gantt chart, revisions to project schedule, Kanban boards, Github projects, etc.
When developing a product there will often be teams that focus on specific components, as well as teams that integrate multiple components to fulfill specific system functions. These are, respectively, known as horizontals and verticals. This terminology follows the notion of a "tech stack" with high-level, slow components on top and low-level, fast components on the bottom (see the connection to hierarchical architectures?)
Horizontal development: development that focuses on a technical component.
Vertical development: development that focuses on integrating technical components into an overall system function or behavior.
TODO: figure showing horizontal / vertical matrix
Engineers on a horizontal team will focus on refining a component's performance. For example, an object detection team would be a horizontal one and would focus on improving detection accuracy. They will also work with members of intersecting vertical teams to ensure that their component works to implement the vertical function. These will typically be subject-matter specialists with intimate knowledge of the mechanical, electrical, algorithmic, and/or computational aspects of that component. Their performance metrics will typically involve unit testing.
Engineers on a vertical team will focus on expanding the range of functions of the system, or its operational domain. For example, in an autonomous driving company a lane changing team would be focused on producing high quality driving behavior when the vehicle needs to perform a lane change. They will often have specialists in multiple relevant horizontal teams who will work with those horizontal teams to ensure that the system function can be implemented. For example, lane changing may require specialized agent trajectory prediction and motion planning functions, so working closely with those teams should be a high priority for this vertical. In contrast, an object detection horizontal team may not need to be closely involved, since lane changing does not typically require any different object detection capabilities compared to normal driving. A vertical team's performance metrics will typically involve system testing.
It is a common pitfall, especially in smaller organizations, to assign effort only to horizontal components or only to vertical ones. Without verticals, the effort on components may not be well-targeted to produce the desired functions of the system, which leads to last-minute scrambling as product deadlines grow near. Without horizontals, development is slowed down by a lack of coherence and expertise in technical components. You may end up with a mess of code with multiple implementations of motion planners, object detectors, etc. with different APIs, coding conventions, and quality standards. In a real-world example of this, I participated on a DARPA Robotics Challenge team that was vertically oriented. The competition asked teams to develop a robot to complete 8 search-and-rescue tasks, and the theory was to have many professors on the same team, each with expertise relevant to one of the tasks. My students and I were on the ladder climbing team, another professor's lab would address valve turning, another's would address driving, etc. As it turns out, the lack of coordination between task subteams was a big handicap. Although we scored quite well on our event during the semifinals, the team as a whole didn't make it to the finals...
A concept that has gained popularity through its development at NASA, and its later adoption by the U.S. Department of Defense, the EU, and the International Organization for Standardization (ISO), is the notion of Technology Readiness Levels (TRLs).
Technology readiness level (TRL): a rating scale from 1-9 designating the maturity level of a piece of technology, ranging from basic principles observed (TRL 1) to a system fully proven in its operational environment (TRL 9).
Usually, university research operates at TRLs 1-4, at which point a technology is validated in a lab environment. The transition from TRL 4 to 6 is often accomplished in industry or applied research labs. The last stages of maturing a technology, TRL 7-9, involve product development and refinement and are almost always accomplished in industry or government labs.
Intermediate stages of development, roughly TRL 4-7, are known as the technological Valley of Death. The reason for this is that many promising technologies are mature enough to be demonstrated in the lab, but the amount of investment required to turn them into a reliable product (known as technology translation) is often underestimated. For example, costs for safety certification of medical devices can run into the tens or hundreds of millions of dollars. This phase is also accompanied by a shift in personnel from the original inventors of the early-stage technology to a development team, and this shift may come with a loss of momentum, enthusiasm, technical expertise, or project management expertise. It may be unwise to ask a professor to start a company!
Another serious risk for any translational endeavor is improper product-market fit. We technology developers are always enthusiastic about our technology, which leads us to wear "blinders" that prevent us from predicting whether the market (i.e., consumers) will appreciate our product. Robotics is especially susceptible to this kind of failure. The remedy to this tendency is to perform early market analysis by speaking to potential consumers, whether they are factory owners who might purchase an intelligent automation device or members of the general public who might buy a home robot. The results may be eye-opening or even damning to your idea. You may get your best results by switching development priorities, e.g., you find that a new factory robot needs to identify items in clear plastic bags. Or, you may realize that your dream is doomed, e.g., you find that the acceptable number of dishes dropped by a home robot is less than 1 per month, but your best lab tests place your algorithm at 5 drops per hour. Convincing your market that their needs are irrelevant is the definition of foolishness. You might be able to convince an investor to give you money for your idea, but in the long run, your customers will decide whether your business succeeds!
System testing (aka integration testing) evaluates whether the entire system performs its specified function according to key performance objectives. This is of course the end-goal of developing a system, but the integration process is expensive and takes a very long time. So, system engineering typically involves a large amount of unit testing, which evaluates whether an individual component performs its specified function.
If designed correctly, unit tests help developers align their efforts toward relevant goals, give project managers a better sense of priorities for allocating effort, and build the whole team's confidence that the system will be ready for its ultimate tests. However, unit testing takes time and can even waste development effort if the metrics are not chosen wisely to align with system-level goals, or if the test cases are not chosen properly.
To perform unit testing, a developer will:
For many components, we do not have a perfect idea of what the outputs should be. Instead, the developer will seek to measure performance, using the following process:
Defining good mocks is extremely important in unit testing, and can be quite challenging in robotics. Essentially, an ideal mock would emulate the outputs of any upstream components so that we can predict how our tested component will perform in practice. There are several ways of getting close to this: stubs, replays, simulations, and faked simulations.
Let's take an object trajectory prediction component as an example: it takes the output of an object detector as input and extrapolates the future trajectories of detected objects. We would like to mock the object detector. For a stub, we could generate some hypothetical detections of an object moving in a straight line with some noise, and verify whether the predictor generates predictions along that line. For a replay, we would simply record the output of the object detector running on some video data. For a simulation, we would run our test by running the simulation and the object detector on the images generated by the simulation. Finally, for a faked simulation, we would skip generating images in simulation and instead build a fake object detector that reads object states directly from the simulation.
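Here is a rough sketch of the stub approach as a unit test (the predictor here is a toy constant-velocity extrapolator standing in for the real component; all names are hypothetical, and noise is omitted to keep the assertion deterministic):

```python
import math

def predict_linear(detections, horizon, dt):
    """Toy constant-velocity predictor: extrapolates the last observed velocity.
    Stands in for the real trajectory prediction component under test."""
    (t0, x0, y0), (t1, x1, y1) = detections[-2], detections[-1]
    vx, vy = (x1 - x0) / (t1 - t0), (y1 - y0) / (t1 - t0)
    return [(x1 + vx * k * dt, y1 + vy * k * dt) for k in range(1, horizon + 1)]

def test_straight_line_motion():
    # Stub for the object detector: (time, x, y) detections along the line y = 0.5 x.
    stub_detections = [(0.1 * k, 1.0 * 0.1 * k, 0.5 * 0.1 * k) for k in range(10)]
    predictions = predict_linear(stub_detections, horizon=5, dt=0.1)
    # The predictions should stay on that same line.
    for x, y in predictions:
        assert math.isclose(y, 0.5 * x, abs_tol=1e-6)

test_straight_line_motion()
```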
In addition to analyzing the expected performance of a component under realistic inputs, it is often helpful to analyze the upper limit of performance of a component under "ideal" inputs. This process is known as headroom analysis, and its purpose is to help inform development priorities. Suppose component A takes input from component B and has a performance metric M, and we are deciding whether to invest in an alternative implementation A'. However, the inputs to A' would require us to modify component B to B' or add an additional transformation layer C that would process B's outputs. Instead of implementing these changes (and then potentially having to roll them back if A' doesn't work as well as desired), we can first perform headroom analysis by defining mocks for A' that simulate ideal inputs according to a hypothetical implementation of B'. We can also simplify our implementation of A' to avoid challenging or pathological inputs. If the metric result M' in headroom analysis does not improve significantly on M, then it is not worth investing in implementing full versions of A' and B'.
To perform system testing, a developer will:
Measuring performance may involve either manual observation or instrumentation of the test environment.
A system metric that is used by management to measure team or project progress is known as a key performance indicator (KPI).
There are many performance metrics used in robotic systems, and here we describe some of the most common ones in use.
Actuators / robot arms
Sensors
State estimation / SLAM
Object detection
Segmentation
Tracking
System identification
Computation time is a common metric for all planners.
Kinematic path planning
Kinodynamic path planning (in addition to kinematic path planning metrics)
Trajectory optimization
Model predictive control (in addition to control metrics)
Multi-agent path planning
Informative path planning / active sensing
Imitation learning
Reinforcement learning
Industrial robots
Autonomous vehicles
Note that metrics collected over time, from many examples, or along many dimensions will need to be aggregated in some way to report a single scalar number.
Max error, MAE, MSE, RMSE. Confidence intervals.
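As a quick reference, a small sketch of computing these aggregates with numpy (the error values are made up, and the 95% interval uses a normal approximation):

```python
import numpy as np

errors = np.array([0.02, -0.01, 0.05, 0.00, -0.03, 0.04])  # illustrative residuals

max_error = np.max(np.abs(errors))
mae  = np.mean(np.abs(errors))   # mean absolute error
mse  = np.mean(errors ** 2)      # mean squared error
rmse = np.sqrt(mse)              # root mean squared error

# ~95% confidence interval on the mean error (normal approximation)
mean = errors.mean()
sem  = errors.std(ddof=1) / np.sqrt(len(errors))
ci_95 = (mean - 1.96 * sem, mean + 1.96 * sem)

print(max_error, mae, rmse, ci_95)
```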
Domain
Validity
Thoroughness
In- and out-of-distribution
Persistent questions: Are our tests representative? Are they conclusive? When do we stop testing and start redesigning? How do we tell which component was responsible for poor behavior?
Dunbar's number
Waterfall: an organizational philosophy that breaks a project into sequential stages with clearly defined development objectives that must be met before proceeding to the next.
Agile: an organizational philosophy that prioritizes frequent changes to adapt to product and customer needs. It deprioritizes systematic long-term planning due to the inability to foresee precise specifications.
Code is code, right?
You couldn't be more wrong! Organizing code well is the single most important imperative of software engineering. Code must be iteratively debugged, improved upon, and maintained, and so you and others on your team will need to be able to browse files, read code, identify problem areas, and modify significant parts of your code throughout the lifetime of your project.
Here are some tips to help you start organizing your projects better:
D level: "Code scraps"
C level: Research code
B level: Legitimate module (installable with `pip install`).
A level: Maintained package (installable with `pip install`).
Github
Branching
Pull requests
Code review
Continuous integration