Use Case

At a company I’ll call “DoxHut,” the team ran PoCs (proofs of concept) directly in production. The software was deployed on devices already installed at the customer’s facilities. Although it was code injected into an application, it required various hardware installations and frequent maintenance.

However, they faced a challenge when it came to handling incidents. The senior SREs (Site Reliability Engineers) had a lot on their plates, leading them to onboard new team members, mainly juniors. The newcomers needed proper guidance on their day-to-day tasks, which sometimes meant the more experienced colleagues had to pause their work to assist them with “basic stuff.”

When I took on the tech writer position, I had the opportunity to get to know each team better. During interviews, I asked them about their pain points. It turned out that they had always wanted to create an FAQ for newcomers, but they never had the chance to do so. Gathering testimonies and data, I created a page in our knowledge base with frequently asked questions. After reviewing and publishing it, I shared the page with the team. This sparked discussions, leading to additional items to include, which I promptly addressed in the periodic updates I had planned for this documentation piece.

SRE FAQ

Goal

Welcome to the section for common DevOps and SRE requests! Here, you’ll find solutions to tasks like setting up a new user on a machine or generating an SSH key, all aimed at saving our SREs valuable time and unblocking coders more swiftly.

💡 We’re all about staying up-to-date! If you come across any outdated solutions, don’t hesitate to reach out and start a discussion on our Slack channel (#sre-faqs). Feel free to ping our technical writer, @NaguiPinetta, to request an update. We’re committed to providing the latest and greatest solutions for you!

How To

Set up a new user on a machine:

sudo adduser <user>              # Create the new user
sudo usermod -aG sudo <user>     # Add the user to the 'sudo' group for administrative privileges
sudo gpasswd -a <user> docker    # Add the user to the 'docker' group for Docker access

Or, as a one-liner:

sudo adduser <user> && sudo usermod -aG sudo,docker <user>

💡 While the one-liner sets up the user with sudo and Docker access, it does not grant explicit passwordless sudo permissions. If you want to provide passwordless sudo access, you’ll need to modify the sudoers file accordingly. However, please exercise caution when granting passwordless sudo access, and only do so for trusted users. Security should always be a top priority!
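One way to grant passwordless sudo without touching the main sudoers file is a drop-in file under /etc/sudoers.d. The sketch below is only an illustration, not an established DoxHut procedure: the helper name sudoers_dropin and the file name in the comments are assumptions. The function just prints the rule; the commented lines show how it could be installed and checked.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a sudoers rule that gives one user passwordless sudo.
sudoers_dropin() {
    local user="$1"
    # Reject anything that is not a plain lowercase login name.
    if ! printf '%s' "$user" | grep -Eq '^[a-z_][a-z0-9_-]*$'; then
        echo "invalid username: $user" >&2
        return 1
    fi
    printf '%s ALL=(ALL:ALL) NOPASSWD: ALL\n' "$user"
}

# Example installation (run manually, as an admin; "alice" is a placeholder):
#   sudoers_dropin alice | sudo tee /etc/sudoers.d/alice >/dev/null
#   sudo chmod 0440 /etc/sudoers.d/alice
#   sudo visudo -cf /etc/sudoers.d/alice   # syntax check before relying on it
```

Scoping the rule to a single user keeps the grant narrower than changing the rule for the whole sudo group.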

Generate an SSH key:

ssh-keygen -t ed25519 -C "<name>@doxhut.xyz"

💡 Remember to keep the private key secure and avoid sharing it with others. Security is crucial!
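A quick way to confirm the private key is not readable by other users is to check its file mode. This is a minimal sketch: check_key_perms is a hypothetical helper, and it assumes GNU stat (the default on most Linux machines).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a private key is accessible by its owner alone.
check_key_perms() {
    local key="$1" mode
    [ -f "$key" ] || { echo "no such file: $key" >&2; return 2; }
    mode=$(stat -c '%a' "$key")      # GNU stat; prints the octal mode, such as 600
    case "$mode" in
        600|400) return 0 ;;
        *) echo "$key has mode $mode; run: chmod 600 $key" >&2; return 1 ;;
    esac
}

# Example, using the default path that ssh-keygen suggests for ed25519 keys:
#   check_key_perms ~/.ssh/id_ed25519
```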

TLDR Command to Delete a User:

The userdel command removes a user account from a Linux system. It does not remove a user from a group; use gpasswd -d for that. Please note that all commands must be executed as root.

For more information about userdel, refer to the manual page.

To remove a user:

userdel [name]                               # Remove the user account
userdel --remove [name]                      # Remove the account along with its home directory and mail spool
userdel --root [path/to/other/root] [name]   # Remove the account from a system rooted at another directory

To remove a user from a group without deleting the account:

gpasswd -d [name] [group]

💡 Remember to replace [name], [group], and [path/to/other/root] with the actual username, group name, and path to the other root directory, respectively. Always exercise caution with userdel --remove, as it irreversibly deletes the user’s data.
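Before deleting an account, it can help to confirm the user exists and to see whether they still have running processes, since userdel refuses to remove an account with active processes unless forced. The helpers below are a hypothetical sketch (the names user_exists and preflight_userdel are not part of any existing tooling) and they change nothing on the system.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check to run before userdel. Read-only.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

preflight_userdel() {
    local name="$1"
    if ! user_exists "$name"; then
        echo "user '$name' does not exist" >&2
        return 1
    fi
    # List the user's running processes, if any (prints nothing when there are none).
    ps -u "$name" -o pid=,comm= 2>/dev/null
    echo "ok: '$name' exists; review any processes listed above before deleting"
}

# Example: preflight_userdel <name>
```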

CVD Upload Script:

A significant change has been made to the CVD upload script, where the code has been refactored to support camera coordinates for specific Cam IDs. An example configuration in the upload script is as follows:

cam-config:
  fps: 25
  base-dimension:
  - 1280
  - 720
  origins:
    7:
    - 0
    - 0
    9:
    - 1280
    - 0


This configuration sets the frames per second (fps) and the base dimensions (1280x720), and assigns a coordinate pair (origin) to each listed camera ID, here 7 and 9.

Reinstall k3s, Set up Rabbit, and GPU Splitting:

Usage and Options:

root@dev-office-inference-0:/home/agot# k3scli.sh -h
    Usage: k3scli.sh args ...

    Description:
    Options:
        -k Uninstall and install k3s
        -r Install rabbit
        -g Setup GPU sharing
        -a Install AWS CLI

Reinstall Everything:

k3scli.sh -k -g -r


This command uninstalls the current k3s version, performs a fresh installation, configures GPU sharing, and deploys Rabbit.
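For anyone curious how a flag-driven wrapper like this can combine options, here is a rough sketch using getopts. It is not the source of k3scli.sh, which this FAQ does not show: the function name plan_steps, the printed step names, and the fixed execution order are all assumptions made for illustration, and the function only prints what it would do.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of flag parsing for a wrapper like k3scli.sh.
# It only prints the steps it would run; nothing is installed or removed.
plan_steps() {
    local OPTIND=1 opt k=0 g=0 r=0 a=0
    while getopts "krgah" opt; do
        case "$opt" in
            k) k=1 ;;
            r) r=1 ;;
            g) g=1 ;;
            a) a=1 ;;
            h) echo "Usage: k3scli.sh [-k] [-r] [-g] [-a]"; return 0 ;;
            *) return 2 ;;
        esac
    done
    # Assumed design: run in a fixed order regardless of how the flags were typed,
    # so the cluster is reinstalled before anything is set up on top of it.
    [ "$k" -eq 1 ] && echo "reinstall k3s"
    [ "$g" -eq 1 ] && echo "set up GPU sharing"
    [ "$r" -eq 1 ] && echo "install rabbit"
    [ "$a" -eq 1 ] && echo "install AWS CLI"
    return 0
}

plan_steps -k -g -r
```

Collecting the flags first and acting afterwards means that -r -k and -k -r produce the same plan.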

Killing a Running Process

In case of old processes running in the background and causing slowdowns, it is essential to identify and terminate them. The following commands will help you to pinpoint the troublesome processes and responsibly terminate them.

PS Command - Information about Running Processes:

To list information on running processes, use the ps command:

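ps aux                   # List all running processes
ps auxww                 # List all running processes with the full command string
ps aux | grep string     # Search for processes matching a string
ps --user $(id -u) -F    # List the current user's processes in extra full format
ps --user $(id -u) f     # List the current user's processes as a tree
ps -o ppid= -p pid       # Get the parent PID of a process
ps --sort size           # Sort processes by memory consumption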

🧷 More information about the ps command can be found here.
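To hunt specifically for old processes, it can be handy to filter by how long each one has been running. The sketch below is an illustration under assumptions: filter_old and old_procs are hypothetical helper names, and old_procs relies on the etimes column (elapsed seconds) and the start_time sort key from procps-ng ps, which may not be available in other ps implementations.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting long-running processes.

# Keep only rows whose second column (elapsed seconds) is at least $1.
# Expects "pid etimes user command" rows on stdin.
filter_old() {
    awk -v min="$1" '$2 + 0 >= min + 0'
}

# List processes that have been running for at least $1 seconds, oldest first.
# Assumes procps-ng ps (etimes column, start_time sort key).
old_procs() {
    ps -eo pid=,etimes=,user=,comm= --sort=start_time | filter_old "$1"
}

# Example: everything that has been running for more than one day
#   old_procs 86400
```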

Kill Command - Terminate a Process:

The kill command sends a signal to a process, usually to stop it. Choose the appropriate command based on your scenario:

💡 All signals except for SIGKILL and SIGSTOP can be intercepted by the process to perform a clean exit.

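kill process_id             # Terminate a process with the default SIGTERM signal
kill -l                     # List the available signal names
kill %job_id                # Terminate a background job
kill -HUP process_id        # Send SIGHUP (signal 1)
kill -INT process_id        # Send SIGINT (signal 2), the same as pressing Ctrl+C
kill -KILL process_id       # Send SIGKILL (signal 9); the process gets no chance to clean up
kill -STOP process_id       # Send SIGSTOP by name (its number varies by platform) to pause the process until it receives SIGCONT
kill -SIGUSR1 -group_id     # Send SIGUSR1 to every process in the given process group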

🧷 More information about the kill command can be found here.

⚠️ Caution: These commands are sensitive and can lead to issues. Killing a process might affect someone else’s work. Please use these commands with care and consideration.
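A gentle pattern for stopping a process is to send SIGTERM first, give the process a few seconds to exit cleanly, and fall back to SIGKILL only if it is still alive. The function below is a minimal sketch of that pattern; graceful_kill and its default 10-second timeout are assumptions, not an existing team tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask a process to exit, and force it only if it ignores the request.
# Usage: graceful_kill <pid> [timeout_seconds]
graceful_kill() {
    local pid="$1" timeout="${2:-10}" i=0
    kill -TERM "$pid" 2>/dev/null || { echo "cannot signal process $pid" >&2; return 1; }
    while [ "$i" -lt "$timeout" ]; do
        sleep 1
        # kill -0 sends no signal; it only tests whether the PID can still be signalled.
        kill -0 "$pid" 2>/dev/null || { echo "terminated"; return 0; }
        i=$((i + 1))
    done
    # Still alive after the timeout: SIGKILL cannot be intercepted.
    kill -KILL "$pid" 2>/dev/null
    echo "killed"
}
```

Starting with SIGTERM gives the process a chance to flush files and release locks, which SIGKILL never allows.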

How to Create S3 Buckets

Follow the guidelines included in this repo.

How to install

Adding User for Automations

To set up a new user with administrative privileges for automations, follow these steps as root.

Create the user. This command creates a new user with the username awx and sets the user’s shell to /bin/bash. The user is added to the sudo group, granting administrative privileges:

useradd -c "User for automations" -G "sudo" -s /bin/bash -m awx

Create the user’s .ssh directory and authorized_keys file with the correct ownership and permissions:

mkdir -p /home/awx/.ssh && chmod 0700 /home/awx/.ssh && touch /home/awx/.ssh/authorized_keys && chown -R awx: /home/awx/.ssh && chmod 0600 /home/awx/.ssh/authorized_keys

Open the sudoers file for editing:

sudo visudo

Make sure the %sudo line includes NOPASSWD, so that members of the sudo group (including awx) can run sudo without a password prompt. The file should look like this:

Defaults    env_reset
Defaults    mail_badpass
Defaults    secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"
root    ALL=(ALL:ALL) ALL
%admin  ALL=(ALL) ALL
%sudo   ALL=(ALL:ALL) NOPASSWD: ALL
# See sudoers(5) for more information on "#include" directives:
#includedir /etc/sudoers.d

Add the automation’s public key to /home/awx/.ssh/authorized_keys:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMOmhXTjtS4Tehalzfyn6KwPU0CwYpCSRuv2+P/bZrrc user for automation
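When the key is added by a script rather than by hand, it is worth making the step safe to repeat so the same key never ends up in the file twice. Here is a minimal sketch; add_authorized_key is a hypothetical helper name, and the example path simply reuses the awx home directory from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a public key to an authorized_keys file only if it is missing.
# Usage: add_authorized_key <file> <public key line>
add_authorized_key() {
    local file="$1" key="$2"
    touch "$file" && chmod 0600 "$file" || return 1
    # -x matches the whole line, -F treats the key as a literal string.
    if grep -qxF -- "$key" "$file"; then
        echo "already present"
    else
        printf '%s\n' "$key" >> "$file"
        echo "added"
    fi
}

# Example (as root):
#   add_authorized_key /home/awx/.ssh/authorized_keys "<public key line>"
```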


Useful External Documentation

kubectl Reference Docs

Access the official kubectl documentation.
If you are wondering how to perform a specific action inside a cluster, you can find the corresponding command by looking for the closest verb in the left pane. The verbs listed in the documentation are closely related to the actions you want to execute within Kubernetes. This can help you quickly identify the appropriate command for your task.

Useful Tools

TLDR

TLDR is a tool that provides concise, practical cheatsheets for console commands. For example, running tldr rm lists the most frequently used rm commands with a short explanation of each. You can find more information about this tool and explore its collaborative cheatsheets on GitHub at tldr-pages/tldr. TLDR can save you time and effort by presenting the most relevant information in a clear and easy-to-understand format.

This document was last updated on 06/06/2022 by Nagui Pinetta.
