book

Building Secure and Reliable Systems

by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield

March 2020

Intermediate to advanced

555 pages

16h 29m

English

O'Reilly Media, Inc.

Book available

Related skills

Site Reliability Engineering (SRE)

Associated roles

DevOps engineer
SRE

Why We Wrote This Book | Who This Book Is For | A Note About Culture | How to Read This Book | Conventions Used in This Book | O’Reilly Online Learning | How to Contact Us | Acknowledgments
On Passwords and Power Drills | Reliability Versus Security: Design Considerations | Confidentiality, Integrity, Availability | Confidentiality | Integrity | Availability | Reliability and Security: Commonalities | Invisibility | Assessment | Simplicity | Evolution | Resilience | From Design to Production | Investigating Systems and Logging | Crisis Response | Recovery | Conclusion
Attacker Motivations | Attacker Profiles | Hobbyists | Vulnerability Researchers | Governments and Law Enforcement | Activists | Criminal Actors | Automation and Artificial Intelligence | Insiders | Attacker Methods | Threat Intelligence | Cyber Kill Chains™ | Tactics, Techniques, and Procedures | Risk Assessment Considerations | Conclusion
Safe Proxies in Production Environments | Google Tool Proxy | Conclusion
Design Objectives and Requirements | Feature Requirements | Nonfunctional Requirements | Features Versus Emergent Properties | Example: Google Design Document | Balancing Requirements | Example: Payment Processing | Managing Tensions and Aligning Goals | Example: Microservices and the Google Web Application Framework | Aligning Emergent-Property Requirements | Initial Velocity Versus Sustained Velocity | Conclusion
Concepts and Terminology | Least Privilege | Zero Trust Networking | Zero Touch | Classifying Access Based on Risk | Best Practices | Small Functional APIs | Breakglass | Auditing | Testing and Least Privilege | Diagnosing Access Denials | Graceful Failure and Breakglass Mechanisms | Worked Example: Configuration Distribution | POSIX API via OpenSSH | Software Update API | Custom OpenSSH ForceCommand | Custom HTTP Receiver (Sidecar) | Custom HTTP Receiver (In-Process) | Tradeoffs | A Policy Framework for Authentication and Authorization Decisions | Using Advanced Authorization Controls | Investing in a Widely Used Authorization Framework | Avoiding Potential Pitfalls | Advanced Controls | Multi-Party Authorization (MPA) | Three-Factor Authorization (3FA) | Business Justifications | Temporary Access | Proxies | Tradeoffs and Tensions | Increased Security Complexity | Impact on Collaboration and Company Culture | Quality Data and Systems That Impact Security | Impact on User Productivity | Impact on Developer Complexity | Conclusion
Why Is Understandability Important? | System Invariants | Analyzing Invariants | Mental Models | Designing Understandable Systems | Complexity Versus Understandability | Breaking Down Complexity | Centralized Responsibility for Security and Reliability Requirements | System Architecture | Understandable Interface Specifications | Understandable Identities, Authentication, and Access Control | Security Boundaries | Software Design | Using Application Frameworks for Service-Wide Requirements | Understanding Complex Data Flows | Considering API Usability | Conclusion
Types of Security Changes | Designing Your Change | Architecture Decisions to Make Changes Easier | Keep Dependencies Up to Date and Rebuild Frequently | Release Frequently Using Automated Testing | Use Containers | Use Microservices | Different Changes: Different Speeds, Different Timelines | Short-Term Change: Zero-Day Vulnerability | Medium-Term Change: Improvement to Security Posture | Long-Term Change: External Demand | Complications: When Plans Change | Example: Growing Scope—Heartbleed | Conclusion
Design Principles for Resilience | Defense in Depth | The Trojan Horse | Google App Engine Analysis | Controlling Degradation | Differentiate Costs of Failures | Deploy Response Mechanisms | Automate Responsibly | Controlling the Blast Radius | Role Separation | Location Separation | Time Separation | Failure Domains and Redundancies | Failure Domains | Component Types | Controlling Redundancies | Continuous Validation | Validation Focus Areas | Validation in Practice | Practical Advice: Where to Begin | Conclusion
What Are We Recovering From? | Random Errors | Accidental Errors | Software Errors | Malicious Actions | Design Principles for Recovery | Design to Go as Quickly as Possible (Guarded by Policy) | Limit Your Dependencies on External Notions of Time | Rollbacks Represent a Tradeoff Between Security and Reliability | Use an Explicit Revocation Mechanism | Know Your Intended State, Down to the Bytes | Design for Testing and Continuous Validation | Emergency Access | Access Controls | Communications | Responder Habits | Unexpected Benefits | Conclusion
Strategies for Attack and Defense | Attacker’s Strategy | Defender’s Strategy | Designing for Defense | Defendable Architecture | Defendable Services | Mitigating Attacks | Monitoring and Alerting | Graceful Degradation | A DoS Mitigation System | Strategic Response | Dealing with Self-Inflicted Attacks | User Behavior | Client Retry Behavior | Conclusion
Background on Publicly Trusted Certificate Authorities | Why Did We Need a Publicly Trusted CA? | The Build or Buy Decision | Design, Implementation, and Maintenance Considerations | Programming Language Choice | Complexity Versus Understandability | Securing Third-Party and Open Source Components | Testing | Resiliency for the CA Key Material | Data Validation | Conclusion
Frameworks to Enforce Security and Reliability | Benefits of Using Frameworks | Example: Framework for RPC Backends | Common Security Vulnerabilities | SQL Injection Vulnerabilities: TrustedSqlString | Preventing XSS: SafeHtml | Lessons for Evaluating and Building Frameworks | Simple, Safe, Reliable Libraries for Common Tasks | Rollout Strategy | Simplicity Leads to Secure and Reliable Code | Avoid Multilevel Nesting | Eliminate YAGNI Smells | Repay Technical Debt | Refactoring | Security and Reliability by Default | Choose the Right Tools | Use Strong Types | Sanitize Your Code | Conclusion
Unit Testing | Writing Effective Unit Tests | When to Write Unit Tests | How Unit Testing Affects Code | Integration Testing | Writing Effective Integration Tests | Dynamic Program Analysis | Fuzz Testing | How Fuzz Engines Work | Writing Effective Fuzz Drivers | An Example Fuzzer | Continuous Fuzzing | Static Program Analysis | Automated Code Inspection Tools | Integration of Static Analysis in the Developer Workflow | Abstract Interpretation | Formal Methods | Conclusion
Concepts and Terminology | Threat Model | Best Practices | Require Code Reviews | Rely on Automation | Verify Artifacts, Not Just People | Treat Configuration as Code | Securing Against the Threat Model | Advanced Mitigation Strategies | Binary Provenance | Provenance-Based Deployment Policies | Verifiable Builds | Deployment Choke Points | Post-Deployment Verification | Practical Advice | Take It One Step at a Time | Provide Actionable Error Messages | Ensure Unambiguous Provenance | Create Unambiguous Policies | Include a Deployment Breakglass | Securing Against the Threat Model, Revisited | Conclusion
From Debugging to Investigation | Example: Temporary Files | Debugging Techniques | What to Do When You’re Stuck | Collaborative Debugging: A Way to Teach | How Security Investigations and Debugging Differ | Collect Appropriate and Useful Logs | Design Your Logging to Be Immutable | Take Privacy into Consideration | Determine Which Security Logs to Retain | Budget for Logging | Robust, Secure Debugging Access | Reliability | Security | Conclusion
Defining “Disaster” | Dynamic Disaster Response Strategies | Disaster Risk Analysis | Setting Up an Incident Response Team | Identify Team Members and Roles | Establish a Team Charter | Establish Severity and Priority Models | Define Operating Parameters for Engaging the IR Team | Develop Response Plans | Create Detailed Playbooks | Ensure Access and Update Mechanisms Are in Place | Prestaging Systems and People Before an Incident | Configuring Systems | Training | Processes and Procedures | Testing Systems and Response Plans | Auditing Automated Systems | Conducting Nonintrusive Tabletops | Testing Response in Production Environments | Red Team Testing | Evaluating Responses | Google Examples | Test with Global Impact | DiRT Exercise Testing Emergency Access | Industry-Wide Vulnerabilities | Conclusion
Is It a Crisis or Not? | Triaging the Incident | Compromises Versus Bugs | Taking Command of Your Incident | The First Step: Don’t Panic! | Beginning Your Response | Establishing Your Incident Team | Operational Security | Trading Good OpSec for the Greater Good | The Investigative Process | Keeping Control of the Incident | Parallelizing the Incident | Handovers | Morale | Communications | Misunderstandings | Hedging | Meetings | Keeping the Right People Informed with the Right Levels of Detail | Putting It All Together | Triage | Declaring an Incident | Communications and Operational Security | Beginning the Incident | Handover | Handing Back the Incident | Preparing Communications and Remediation | Closure | Conclusion
Recovery Logistics | Recovery Timeline | Planning the Recovery | Scoping the Recovery | Recovery Considerations | Recovery Checklists | Initiating the Recovery | Isolating Assets (Quarantine) | System Rebuilds and Software Upgrades | Data Sanitization | Recovery Data | Credential and Secret Rotation | After the Recovery | Postmortems | Examples | Compromised Cloud Instances | Large-Scale Phishing Attack | Targeted Attack Requiring Complex Recovery | Conclusion
Background and Team Evolution | Security Is a Team Responsibility | Help Users Safely Navigate the Web | Speed Matters | Design for Defense in Depth | Be Transparent and Engage the Community | Conclusion
Who Is Responsible for Security and Reliability? | The Roles of Specialists | Understanding Security Expertise | Certifications and Academia | Integrating Security into the Organization | Embedding Security Specialists and Security Teams | Example: Embedding Security at Google | Special Teams: Blue and Red Teams | External Researchers | Conclusion
Defining a Healthy Security and Reliability Culture | Culture of Security and Reliability by Default | Culture of Review | Culture of Awareness | Culture of Yes | Culture of Inevitably | Culture of Sustainability | Changing Culture Through Good Practice | Align Project Goals and Participant Incentives | Reduce Fear with Risk-Reduction Mechanisms | Make Safety Nets the Norm | Increase Productivity and Usability | Overcommunicate and Be Transparent | Build Empathy | Convincing Leadership | Understand the Decision-Making Process | Build a Case for Change | Pick Your Battles | Escalations and Problem Resolution | Conclusion

Overview

Can a system be considered truly reliable if it isn't fundamentally secure? Or can it be considered secure if it's unreliable? Security is crucial to the design and operation of scalable systems in production, as it plays an important part in product quality, performance, and availability. In this book, experts from Google share best practices to help your organization design scalable and reliable systems that are fundamentally secure.

Two previous O’Reilly books from Google—Site Reliability Engineering and The Site Reliability Workbook—demonstrated how and why a commitment to the entire service lifecycle enables organizations to successfully build, deploy, monitor, and maintain software systems. In this latest guide, the authors offer insights into system design, implementation, and maintenance from practitioners who specialize in security and reliability. They also discuss how building and adopting their recommended best practices requires a culture that’s supportive of such change.

You’ll learn about secure and reliable systems through:

Design strategies
Recommendations for coding, testing, and debugging practices
Strategies to prepare for, respond to, and recover from incidents
Cultural best practices that help teams across your organization collaborate effectively